Joint Face Detection and Alignment with a Deformable Hough Transform Model


Abstract. We propose a method for joint face detection and alignment in unconstrained images and videos. Historically, these problems have been addressed separately in the literature, and the overall performance of the combined pipeline has rarely been assessed. We show that a pipeline built by combining state-of-the-art methods for both tasks produces unsatisfactory overall performance. To address this limitation, we propose an approach that handles both tasks jointly, which we call the Deformable Hough Transform Model (DHTM). In particular, we make the following contributions: (a) Rather than scanning the image with discriminatively trained filters, we propose to employ cascaded regression in a sliding window fashion to fit a facial deformable model over the whole image/video. (b) We propose to capitalize on the large basin of attraction of cascaded regression to set up a Hough-Transform voting scheme for detecting faces and filtering out irrelevant background. (c) We report state-of-the-art performance on the most challenging and widely used data sets for face detection, alignment and tracking.
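The voting idea in contribution (b) can be pictured with a toy sketch. This is an illustrative assumption, not the authors' implementation: the abstract does not specify the accumulator, the vote weighting or the thresholds. The function name `hough_vote` and the parameters `cell` and `min_votes` are hypothetical, and each sliding-window fit is assumed to be summarised only by the centre of its fitted shape. The intuition is that fits started near a face converge to roughly the same location and pile up votes, while fits started on background scatter.

```python
import numpy as np

def hough_vote(centers, image_shape, cell=8, min_votes=5):
    """Toy Hough-style accumulator (hypothetical helper, not from the paper).

    centers     : iterable of (x, y) centres of shapes fitted from each window
    image_shape : (height, width) of the image in pixels
    cell        : side of one accumulator bin in pixels (assumed value)
    min_votes   : votes a bin needs to count as a detection (assumed value)
    """
    h, w = image_shape
    acc = np.zeros((h // cell + 1, w // cell + 1), dtype=int)
    for x, y in centers:
        if 0 <= x < w and 0 <= y < h:           # ignore fits that left the image
            acc[int(y) // cell, int(x) // cell] += 1
    peaks = np.argwhere(acc >= min_votes)       # rows of (bin_row, bin_col)
    # Report the pixel centre of each peak bin together with its vote count.
    return [((float(c * cell + cell / 2), float(r * cell + cell / 2)), int(acc[r, c]))
            for r, c in peaks]

# Made-up data: twelve fits that converged near one location, five that scattered.
face_fits = [(98 + i % 4, 58 + i // 4) for i in range(12)]
background_fits = [(10, 10), (50, 120), (200, 30), (150, 90), (30, 70)]

print(hough_vote(face_fits + background_fits, image_shape=(128, 256)))
# prints [((100.0, 60.0), 12)]
```

In this toy, a single hard threshold on bin counts stands in for whatever peak-finding and background filtering the full method uses.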

Keywords: Face detection · Alignment · Tracking · Cascaded regression · Hough Transform

1 Introduction

From Viola and Jones [1] to Deformable Part Models [2–4] and from Active Appearance Models [5] to Cascaded Regression [6–9], face detection, alignment and tracking have all witnessed tremendous progress in recent years. Besides new methodologies, another notable development in the field has been the collection and annotation of large facial data sets captured in-the-wild [3,10–13], for which a number of newly developed methods have been shown to produce remarkable results. Despite this progress, the majority of prior work has considered the problems separately: there is a large number of papers on face detection and perhaps an even larger number on face alignment and tracking, but to the best of our knowledge there are only two papers [3,14] that study the combined problem of detection and alignment, and no method that addresses and evaluates all three tasks jointly. However, for many subsequent, higher level tasks, like face recognition, facial expression and attribute analysis, what matters is the overall performance in terms of accuracy in landmark localization. Notably, recent state-of-the-art methods for such tasks rely heavily on the accurate detection of landmarks (see for example [15,16]). As we show hereafter, the overall landmark localization accuracy might be unsatisfactory even when two recently proposed state-of-the-art methods are combined (we used [4] for face detection and [9] for landmark localization). The reason for this is that face detection follows object detection in terms of measuring performance and, in particular, it uses the PASCAL VOC prec

J. McDonagh and G. Tzimiropoulos
© Springer International Publishing Switzerland 2016. G. Hua and H. Jégou (Eds.): ECCV 2016 Workshops, Part II, LNCS 9914, pp. 569–580, 2016. DOI: 10.1007/978-3-319-48881-3_39
