Robust Face Alignment Using a Mixture of Invariant Experts

1 Mitsubishi Electric Research Labs (MERL), Cambridge, USA
2 Intel Corporation, Santa Clara, USA

Abstract. Face alignment, which is the task of finding the locations of a set of facial landmark points in an image of a face, is useful in widespread application areas. Face alignment is particularly challenging when there are large variations in pose (in-plane and out-of-plane rotations) and facial expression. To address this issue, we propose a cascade in which each stage consists of a mixture of regression experts. Each expert learns a customized regression model that is specialized to a different subset of the joint space of pose and expressions. The system is invariant to a predefined class of transformations (e.g., affine), because the input is transformed to match each expert’s prototype shape before the regression is applied. We also present a method to include deformation constraints within the discriminative alignment framework, which makes our algorithm more robust. Our algorithm significantly outperforms previous methods on publicly available face alignment datasets.
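The invariance idea in the abstract, transforming the input to match a prototype shape before regression, can be illustrated with a minimal sketch. It assumes the transformation class is affine and that the alignment is a plain least-squares fit between corresponding landmarks; the function names `fit_affine` and `apply_affine`, the 5-point prototype, and the displacement values are hypothetical and not taken from the paper, and the regression experts themselves are not modeled here.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares affine map (A, t) with src @ A.T + t ~= dst.

    src, dst: (N, 2) arrays of corresponding landmark coordinates.
    """
    n = src.shape[0]
    X = np.hstack([src, np.ones((n, 1))])        # (N, 3) homogeneous coordinates
    P, *_ = np.linalg.lstsq(X, dst, rcond=None)  # solve X @ P ~= dst, P is (3, 2)
    A = P[:2].T                                  # (2, 2) linear part
    t = P[2]                                     # (2,) translation
    return A, t


def apply_affine(points, A, t):
    """Apply the affine map to an (N, 2) array of points."""
    return points @ A.T + t


# Hypothetical 5-point prototype shape (eyes, nose tip, mouth corners).
prototype = np.array([[-1.0, 1.0], [1.0, 1.0], [0.0, 0.0],
                      [-0.7, -1.0], [0.7, -1.0]])

# Simulate the same shape under in-plane rotation, scaling, and translation.
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
observed = 40.0 * prototype @ R.T + np.array([120.0, 80.0])

# Warp the current landmark estimate into the prototype's coordinate frame.
A, t = fit_affine(observed, prototype)
aligned = apply_affine(observed, A, t)

# A landmark update predicted in the prototype frame (made-up values) is
# mapped back to image coordinates through the inverse linear part only,
# because displacements are unaffected by translation.
delta_proto = np.tile([0.05, 0.0], (5, 1))
delta_image = delta_proto @ np.linalg.inv(A).T
```

Because any regression would operate on `aligned` rather than `observed`, applying a further affine transformation to the input leaves the regressor's input unchanged under these assumptions, which is the sense of invariance the abstract describes.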

1 Introduction

Face alignment refers to finding the pixel locations of a set of predefined facial landmark points (e.g., eye and mouth corners) in an input face image. It is important for many applications such as human-machine interaction, videoconferencing, gaming, and animation, as well as numerous computer vision tasks including face recognition, face tracking, pose estimation, and expression synthesis. Face alignment is difficult due to large variations in factors such as pose, expression, illumination, and occlusion.

1.1 Previous Work

Electronic supplementary material: The online version of this chapter (doi:10.1007/978-3-319-46454-1_50) contains supplementary material, which is available to authorized users.

© Springer International Publishing AG 2016. O. Tuzel et al. In: B. Leibe et al. (Eds.): ECCV 2016, Part V, LNCS 9909, pp. 825–841, 2016. DOI: 10.1007/978-3-319-46454-1_50

Great strides have been made in the field of face alignment since the Active Shape Model (ASM) [1] and Active Appearance Model (AAM) [2] were first proposed. AAM-based face alignment methods proposed since then include [3–5]. To handle wider variations in pose, multi-view AAM and ASM models [6–8] explicitly model and predict the head pose, e.g., by learning a different deformable model for each of several specific pose ranges [7,8]. Another line of research involves multi-camera AAMs, in which an AAM is simultaneously fitted to images of a face captured by multiple cameras [9,10]. Like ASMs and AAMs, Constrained Local Models (CLMs) [11–14] have explicit joint constraints on the landmark point locations (e.g., a subspace shape model) that constrain the positions of the landmarks with respect to each other. Building on CLMs, [15] propose the Gauss-Newton Deformable Part Model (GN-DPM), which uses Gauss-Newton optimization to jointly fit an appearance model and a global shape model. Recently, much of the focus in fa