Do We Really Need to Collect Millions of Faces for Effective Face Recognition?

1 Institute for Robotics and Intelligent Systems, USC, Los Angeles, CA, USA
{iacopo.masi,anhttran,leksut,medioni}@usc.edu
2 Information Sciences Institute, USC, Los Angeles, CA, USA
[email protected]
3 The Open University of Israel, Ra’anana, Israel

Abstract. Face recognition capabilities have recently made extraordinary leaps. Though this progress is at least partially due to ballooning training set sizes (huge numbers of face images downloaded and labeled for identity), it is not clear if the formidable task of collecting so many images is truly necessary. We propose a far more accessible means of increasing training data sizes for face recognition systems: domain-specific data augmentation. We describe novel methods of enriching an existing dataset with important facial appearance variations by manipulating the faces it contains. This synthesis is also used when matching query images represented by standard convolutional neural networks. The effect of training and testing with synthesized images is evaluated on the LFW and IJB-A (verification and identification) benchmarks and on Janus CS2. The performance obtained by our approach matches state-of-the-art results reported by systems trained on millions of downloaded images.

1 Introduction

The recent impact of deep Convolutional Neural Network (CNN) based methods on machine face recognition capabilities has been extraordinary. The conditions under which faces are now recognized and the numbers of faces which systems can now learn to identify have improved to the point where some consider machines to be better than humans at this task. This progress is partially due to the introduction of new and improved network designs. However, alongside developments in network architectures, it is also the underlying ability of CNNs to learn from massive training sets that allows these techniques to be so effective. Realizing that effective CNNs can be made even more effective by increasing their training data, many began focusing efforts on harvesting and labeling large image collections to better train their networks. In [39], a standard CNN was trained by Facebook using 4.4 million labeled faces and shown to achieve what was, at the time, state-of-the-art performance on the Labeled Faces in the

I. Masi, A. Tuấn Trần and T. Hassner contributed equally.

© Springer International Publishing AG 2016. B. Leibe et al. (Eds.): ECCV 2016, Part V, LNCS 9909, pp. 579–596, 2016. DOI: 10.1007/978-3-319-46454-1_35


Dataset                  #ID        #Img         #Img/#ID
CASIA [46]               10,575     494,414      46
Facebook DeepFace [39]   4,030      4.4M         1K
Google FaceNet [33]      8M         200M         25
VGG Face [28]            2,622      2.6M         1K
Facebook Fusion [40]     10M        500M         50
MegaFace [14]            690,572    1.02M        1.5
Aug. pose+shape          10,575     1,977,656    187
Aug. pose+shape+expr     10,575     2,472,070    234

(a) Face set statistics
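
The ratios in the augmented rows of the table can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not code from the paper: the counts are copied from the CASIA and "Aug." rows above, and the function names images_per_subject and growth_factor are illustrative names introduced here.

```python
# Counts copied from the CASIA and "Aug." rows of the face set statistics table.
# Function and variable names are illustrative, not taken from the paper.
NUM_SUBJECTS = 10_575  # identities; the same in all three rows

image_counts = {
    "CASIA": 494_414,
    "Aug. pose+shape": 1_977_656,
    "Aug. pose+shape+expr": 2_472_070,
}


def images_per_subject(num_images: int, num_subjects: int) -> float:
    """Average number of images per identity (#Img/#ID)."""
    return num_images / num_subjects


def growth_factor(augmented_images: int, original_images: int) -> float:
    """How many times larger a set is than the original set."""
    return augmented_images / original_images


if __name__ == "__main__":
    base = image_counts["CASIA"]
    for name, count in image_counts.items():
        per_id = images_per_subject(count, NUM_SUBJECTS)
        factor = growth_factor(count, base)
        print(f"{name:22s} {per_id:7.1f} img/ID   x{factor:.1f} vs. CASIA")
```

With these counts, the two augmented sets come out at exactly four and five times the size of CASIA WebFace, while the number of subjects stays at 10,575.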

(b) Images for subjects [plot: images (y-axis, 0 to 6000) against subjects (x-axis, log scale) for CASIA WebFace; Pose with Shapes; Pose, Shapes, Expression]

Fig. 1. (a) Comparison of our augmented dataset with other face datasets along
