SliderGAN: Synthesizing Expressive Face Images by Sliding 3D Blendshape Parameters



Evangelos Ververas¹ · Stefanos Zafeiriou¹

Received: 15 May 2019 / Accepted: 10 May 2020
© The Author(s) 2020

Abstract

Image-to-image (i2i) translation is the dense regression problem of learning how to transform an input image into an output image using aligned image pairs. Remarkable progress has been made in i2i translation with the advent of deep convolutional neural networks, and in particular with the learning paradigm of generative adversarial networks (GANs). In the absence of paired images, i2i translation is tackled with one or multiple domain transformations (i.e., CycleGAN, StarGAN etc.). In this paper, we study the problem of image-to-image translation under a set of continuous parameters that correspond to a model describing a physical process. In particular, we propose SliderGAN, which transforms an input face image into a new one according to the continuous values of a statistical blendshape model of facial motion. We show that it is possible to edit a facial image according to expression and speech blendshapes, using sliders that control the continuous values of the blendshape model. This provides much more flexibility in various tasks, including but not limited to face editing, expression transfer and face neutralisation, compared to models based on discrete expressions or action units.

Keywords: GAN · Image translation · Facial expression synthesis · Speech synthesis · Blendshape models · Action units · 3DMM fitting · Relativistic discriminator · EmotioNet · 4DFAB · LRW
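To make the slider-based conditioning concrete, the following is a minimal PyTorch sketch of a generator driven by a continuous coefficient vector. It is not the authors' architecture: it assumes a StarGAN-style conditioning strategy in which the blendshape coefficients are tiled into extra input channels, and the class name, layer sizes and the 30-dimensional coefficient vector are illustrative placeholders.

```python
import torch
import torch.nn as nn

class BlendshapeConditionedGenerator(nn.Module):
    """Toy generator: a face image is conditioned on a vector of
    continuous blendshape coefficients by tiling the vector into
    extra channels (illustrative only; layer sizes are placeholders)."""

    def __init__(self, num_blendshapes=30):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + num_blendshapes, 64, kernel_size=7, padding=3),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, kernel_size=7, padding=3),
            nn.Tanh(),  # output image in [-1, 1]
        )

    def forward(self, image, coeffs):
        # image: (B, 3, H, W); coeffs: (B, num_blendshapes) holding the
        # continuous "slider" values. Tile each coefficient over the
        # spatial grid so the convolutions see it at every pixel.
        b, _, h, w = image.shape
        maps = coeffs.view(b, -1, 1, 1).expand(b, coeffs.size(1), h, w)
        return self.net(torch.cat([image, maps], dim=1))

# Usage: "slide" one blendshape coefficient to morph the expression.
g = BlendshapeConditionedGenerator(num_blendshapes=30)
x = torch.randn(1, 3, 128, 128)  # stand-in for a face image
p = torch.zeros(1, 30)
p[0, 4] = 0.7                    # partially activate one hypothetical blendshape
y = g(x, p)                      # edited image, same size as x
```

Because the conditioning vector is continuous rather than a one-hot domain label, intermediate slider values interpolate smoothly between expressions, which is the property that distinguishes this setting from discrete-domain models such as StarGAN.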

Communicated by Jun-Yan Zhu, Hongsheng Li, Eli Shechtman, Ming-Yu Liu, Jan Kautz, Antonio Torralba.

✉ Evangelos Ververas, [email protected]
Stefanos Zafeiriou, [email protected]

¹ Department of Computing, Imperial College London, Queens Gate, London SW7 2AZ, UK

1 Introduction

Interactive editing of the expression of a face in an image has countless applications, including but not limited to movie post-production, computational photography and face recognition (e.g. expression neutralisation). In computer graphics, facial motion editing is a popular field; nevertheless, it mainly revolves around constructing person-specific models that require a large number of training samples (Suwajanakorn et al. 2017). Recently, the advent of machine learning, and especially of Deep Convolutional Neural Networks (DCNNs), has provided exciting tools that have made the community re-think the problem. In particular, recent advances in Generative Adversarial Networks (GANs) provide very exciting solutions for image-to-image (i2i) translation. i2i translation, i.e. the problem of learning how to transform aligned image pairs, has attracted a lot of attention during the last few years (Isola et al. 2017; Zhu et al. 2017; Choi et al. 2018). The so-called pix2pix model and alternatives demonstrated excellent results in image completion etc. (Isola et al. 2017). In order to perform i2i translation in the absence of image pairs, the so-called CycleGAN was proposed, which introduced a