Benchmarking deep network architectures for ethnicity recognition using a new large face dataset


ORIGINAL PAPER

Antonio Greco1 · Gennaro Percannella1 · Mario Vento1 · Vincenzo Vigilante1

Received: 25 November 2019 / Revised: 10 June 2020 / Accepted: 1 September 2020 / Published online: 14 September 2020
© The Author(s) 2020

Abstract Although recent years have seen an explosion of research on the recognition of facial soft biometrics such as gender, age and expression with deep neural networks, the recognition of ethnicity has not received the same attention from the scientific community. The growth of this field is hindered by two related factors: on the one hand, the absence of a sufficiently large and representative dataset prevents effective training of convolutional neural networks for ethnicity recognition; on the other hand, collecting new ethnicity datasets is far from simple and must be carried out manually by humans trained to recognize the basic ethnicity groups from somatic facial features. To fill this gap in facial soft biometrics analysis, we propose the VGGFace2 Mivia Ethnicity Recognition (VMER) dataset, composed of more than 3,000,000 face images annotated with 4 ethnicity categories, namely African American, East Asian, Caucasian Latin and Asian Indian. The final annotations are obtained with a protocol that requires the opinion of three people belonging to different ethnicities, in order to avoid the bias introduced by the well-known other-race effect. In addition, we carry out a comprehensive performance analysis of popular deep network architectures, namely VGG-16, VGG-Face, ResNet-50 and MobileNet v2. Finally, we perform a cross-dataset evaluation to demonstrate that the deep network architectures trained with VMER generalize better to different test sets than the same models trained on the largest ethnicity dataset available so far. The ethnicity labels of the VMER dataset and the code used for the experiments are available upon request at https://mivia.unisa.it.

Keywords Ethnicity recognition · Face analysis · Soft biometrics · Dataset · Deep learning · Benchmark

1 Department of Information and Electrical Engineering and Applied Mathematics, University of Salerno, Fisciano, Italy

1 Introduction

The face is the part of the human body that conveys most of the semantic information about an individual. The so-called facial soft biometrics, namely identity, gender, age, ethnicity and expression, have in recent years attracted the attention of the pattern recognition community, thanks both to the large number of possible applications in retail and video surveillance and to the intrinsic difficulty of designing effective and reliable algorithms for challenging real-world scenarios. This trend is confirmed by the large number of papers [10] describing the use of modern convolutional neural networks (CN
