A survey on face data augmentation for the training of deep neural networks


REVIEW

Xiang Wang1 • Kai Wang1 • Shiguo Lian2

Received: 22 August 2019 / Accepted: 17 January 2020 © Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract The quality and size of the training set have a great impact on the results of deep learning-based face-related tasks. However, collecting and labeling adequate samples with high quality and balanced distributions remains laborious and expensive, and various data augmentation techniques have thus been widely used to enrich the training dataset. In this paper, we review existing works on face data augmentation from the perspectives of transformation types and methods, covering state-of-the-art approaches. Among these approaches, we put the emphasis on deep learning-based works, especially generative adversarial networks, which have been recognized as powerful and effective tools in recent years. We present their principles, discuss the results and show their applications as well as limitations. Different metrics for evaluating these approaches are also introduced. We point out the challenges and opportunities in the field of face data augmentation and provide brief yet insightful discussions.

Keywords Data augmentation • Face image transformation • Generative models

1 Introduction

In recent years, face studies in computer vision have shifted from hand-engineered features to deep learning approaches. As a result, data play a more important role, since the performance of a deep neural network depends heavily on the amount and quality of the training data. The remarkable work by Facebook [129] and Google [116] demonstrated the effectiveness of large-scale datasets for obtaining high-quality trained models and revealed that deep learning relies strongly on large and complex training sets to generalize well in unconstrained settings. However, collecting and labeling a large quantity of real samples is widely recognized as laborious, expensive and error-prone, and existing datasets usually lack balance in their data distribution.

To compensate for insufficient training data, domain adaptation transfers the knowledge learned from a source data distribution to a target data distribution [23–26]. However, it needs samples from the target distribution before training and assumes the target distribution is fixed, which is restrictive in practical scenarios [138]. Data augmentation, on the other hand, provides an effective alternative; applied to faces, we call it "face data augmentation." It is a technique for enlarging the training or testing data by transforming collected real face samples or simulated virtual face samples. Figure 1 shows a schematic diagram of face data augmentation, which is our focus in this paper. Assuming the original dataset is S, face data augmentation can be represented by the following mapping:

& Xiang Wang [email protected]

$$\phi : S \mapsto T$$
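To make the mapping from the original dataset S to the set T concrete, the following is a minimal sketch, not the authors' method. It assumes that T keeps the original samples alongside the transformed ones and that images are H x W x C arrays. The helper names (`augment`, `horizontal_flip`, `adjust_brightness`) and the two transformations are illustrative choices that do not come from the text.

```python
import numpy as np

def augment(dataset, transforms):
    """Map an original dataset S to an augmented dataset T.

    Hypothetical helper: each sample is an H x W x C uint8 array and each
    transform is a function from one image to one new image.
    Assumption: T contains the originals plus every transformed copy.
    """
    augmented = list(dataset)  # keep the original samples
    for image in dataset:
        for transform in transforms:
            augmented.append(transform(image))
    return augmented

def horizontal_flip(image):
    # Mirror the image left-to-right by reversing the width axis.
    return image[:, ::-1, :]

def adjust_brightness(image, delta=20):
    # Shift pixel intensities, then clip back to the valid 8-bit range.
    shifted = image.astype(np.int16) + delta
    return np.clip(shifted, 0, 255).astype(np.uint8)

# Toy stand-in for S: two random 4x4 RGB arrays instead of real face images.
rng = np.random.default_rng(0)
S = [rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8) for _ in range(2)]
T = augment(S, [horizontal_flip, adjust_brightness])
print(len(S), len(T))  # 2 6
```

With two originals and two transformations, the sketch yields six samples: the two originals plus four transformed copies. Whether the originals belong in T is a design choice made here for illustration only.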

Kai Wang ka