Face Tracking in the Compressed Domain


Pedro Miguel Fonseca and Jan Nesvadba

Philips Research, 5656AA Eindhoven, The Netherlands

Received 30 August 2004; Revised 23 March 2005; Accepted 4 May 2005

A compressed-domain generic object tracking algorithm, combined with a face detection algorithm, offers a low-computational-cost solution to the problem of detecting and locating faces in frames of compressed video sequences (such as MPEG-1 or MPEG-2). Objects such as faces can thus be tracked through a compressed video stream using the motion information provided by existing forward and backward motion vectors. The described solution requires only low computational resources on consumer electronics (CE) devices while still offering sufficiently good location rates.

Copyright © 2006 P. M. Fonseca and J. Nesvadba. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

Tracking objects over time is a complex problem in computer vision and has been an important research topic over the last few years. Its importance stems from the applications it enables in areas such as security and surveillance (e.g., tracking people in restricted areas using security cameras), content management (e.g., automatically annotating video content for video abstraction), content improvement (e.g., helping stabilize images in handheld mobile videophones by tracking the location of faces), human-machine interfaces (e.g., automatically recognizing hand gestures in order to execute commands), interactive gaming, and so forth. Constraints such as reliability and computational complexity define the boundary conditions for a successful solution suited to the target platform.

The detection and spatial localization of objects, in particular faces, has been broadly investigated [1, 2]. When tracking identified objects throughout uncompressed video sequences, the objects’ spatial properties (e.g., colour, shape, texture) may be used, since they can be expected to vary little from frame to frame. The information is thus represented in a way that makes the objects easy to track. In compressed video sequences (such as MPEG-1 or MPEG-2), however, the available information may not directly express the objects’ spatial properties, which makes tracking more difficult. In addition, the type of information available varies from frame to frame: for example, MPEG-1 and MPEG-2 video sequences typically comprise I-, P-, and B-frames, each with its own set of parameters.

In this paper we describe an object tracking solution that uses only compressed parameters available in MPEG-1 or MPEG-2 video sequences while performing only the minimal decoding necessary to retrieve them from the compressed video streams. Few algorithms exist that are able to perform object tracking in the compressed d