Detection and Tracking of Humans and Faces
Research Article

Stefan Karlsson, Murtaza Taj, and Andrea Cavallaro

Multimedia and Vision Group, Queen Mary University of London, London E1 4NS, UK

Correspondence should be addressed to Murtaza Taj, [email protected]

Received 15 February 2007; Revised 14 July 2007; Accepted 25 November 2007

Recommended by Maja Pantic

We present a video analysis framework that integrates prior knowledge in object tracking to automatically detect humans and faces, and can be used to generate abstract representations of video (key-objects and object trajectories). The analysis framework is based on the fusion of external knowledge, incorporated in a person and in a face classifier, and low-level features, clustered using temporal and spatial segmentation. Low-level features, namely, color and motion, are used as a reliability measure for the classification. The results of the classification are then integrated into a multitarget tracker based on a particle filter that uses color histograms and a zero-order motion model. The tracker uses efficient initialization and termination rules and updates the object model over time. We evaluate the proposed framework on standard datasets in terms of precision and accuracy of the detection and tracking results, and demonstrate the benefits of the integration of prior knowledge in the tracking process.

Copyright © 2008 Stefan Karlsson et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. INTRODUCTION
Video filtering and abstraction are of paramount importance in advanced surveillance and multimedia database retrieval. Knowledge of the types and positions of objects helps in semantic scene interpretation, indexing video events, and mining large video collections. However, the annotation of a video in terms of its component objects is only as good as the object detection and tracking algorithm on which it is based. The quality of that algorithm depends in turn on its capability of localizing objects of interest (object categories) and of tracking them over time. It is in general difficult to define object categories for retrieval in video, because objects have different meanings and definitions in different applications. However, some categories of objects, such as people and faces, are of interest across several applications and provide relevant cues about the content of a video. Detecting and tracking people and faces provide significant semantic information about the video content for video summarization, intelligent video surveillance, video indexing, and retrieval. Moreover, the human visual system is particularly attracted to people and faces, and therefore their detection and tracking enable perceptual video coding [1].

A number of approaches have been proposed for the integration of object detectors in a tracking process. A stochastic model is implemented