Real-Time 3D Face Tracking with Mutual Information and Active Contours

Abstract. We present a markerless, real-time, model-based 3D face tracking methodology. The system combines two robust and complementary optimization-based strategies, namely active contours and mutual information template matching, in order to obtain real-time performance for full 6dof tracking. First, robust head contour estimation is realized by means of the Contracting Curve Density algorithm, which uses the separation of local color statistics to optimize the contour shape. Afterwards, the 3D face template is robustly matched to the underlying image through fast mutual information optimization. Off-line model building is done using a fast modeling procedure, providing a unique appearance model for each user. Re-initialization criteria are employed in order to obtain a complete and autonomous tracking system.

Keywords: Real-time Face Tracking, Nonlinear Optimization, 3D Template Matching, Mutual Information, Active Contours, Local Statistics Contour Matching, 3D Face Modeling.

1 Introduction

Real-time 3D face tracking is an important problem in computer vision. Several approaches have been proposed and developed with different model definitions, concerning shape, degrees of freedom, use of multiple appearance/shading templates, use of natural facial features, etc. A careful choice of model complexity is critical for real-time tracking, and often forces one to resort to approximate solutions, in terms of the output provided to the end user or to subsequent processing modules. This paper focuses on fast and reliable 6dof face pose estimation and tracking in real-time, based on a hierarchical integration of two robust high-level visual modalities. In order to motivate our approach, we first review related state-of-the-art methodologies from the available literature on the subject. The system proposed in [1] employs a robust multi-layer fusion of different visual cues in the hierarchical framework named IFA [2], proceeding from coarse to accurate visual modalities, and providing the result from the top-level tracker as output; in this work, simple template and feature point models are used at the high levels.

G. Bebis et al. (Eds.): ISVC 2007, Part I, LNCS 4841, pp. 1–12, 2007. © Springer-Verlag Berlin Heidelberg 2007

G. Panin and A. Knoll

Other approaches, using natural face feature detection and tracking in a monocular setting, have been presented in [3][4]. In particular, in [3] a 3D face model is fitted by matching features across subsequent frames, with an approach combining RANSAC [5] and Particle Filters [6] under frame-to-frame epipolar constraints. The method in [4] combines on-line and off-line information by matching local features and optimizing a robust least-squares global cost function. Although this approach can provide better stability, precision and speed, due to the use of local optimization techniques, the joint use of on-line and off-line information requires additional design choices, concerning the overall cost function parameters and the models required for tracking. Template-based a