DeepWarp: Photorealistic Image Resynthesis for Gaze Manipulation
Abstract. In this work, we consider the task of generating highly realistic images of a given face with a redirected gaze. We treat this problem as a specific instance of conditional image generation and suggest a new deep architecture that can handle this task very well, as revealed by a numerical comparison with prior art and a user study. Our deep architecture performs coarse-to-fine warping with an additional intensity correction of individual pixels. All these operations are performed in a feed-forward manner, and the parameters associated with different operations are learned jointly in an end-to-end fashion. After learning, the resulting neural network can synthesize images with a manipulated gaze, while the redirection angle can be selected arbitrarily from a certain range and provided as an input to the network.

Keywords: Gaze correction · Warping · Spatial transformers · Supervised learning · Deep learning

1 Introduction
In this work, we consider the task of learning deep architectures that can transform input images into new images in a certain way (deep image resynthesis). Generally, using deep architectures for image generation has become a very active topic of research. While many interesting results have been reported over recent years, achieving photo-realism beyond the task of synthesizing small patches has proven to be hard.

Previously proposed methods for deep resynthesis usually tackle the resynthesis problem in a general form and strive for universality. Here, we take the opposite approach and focus on a very specific image resynthesis problem (gaze manipulation) that has a long history in the computer vision community [1,7,13,16,18,20,24,26,27] and some important real-life applications. We show that by restricting the scope of the method and exploiting the specifics of the task, we are indeed able to train deep architectures that handle gaze manipulation well and can synthesize output images of high realism (Fig. 1).
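To make the approach concrete, the following is a minimal sketch, in PyTorch, of the general idea described above: a flow field is predicted coarse-to-fine from the eye image and the desired redirection angle, the input is warped by differentiable grid sampling, and a per-pixel intensity correction is applied to the warped result. This is not the authors' exact network; the layer sizes, the angle encoding (tiled and roughly normalized), and the multiplicative form of the correction are all illustrative assumptions.

```python
# A minimal, hypothetical sketch of warping-based gaze redirection:
# coarse-to-fine flow prediction conditioned on the desired angle,
# differentiable warping via grid sampling, and a per-pixel intensity
# correction. All module names and sizes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WarpingGazeRedirector(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        # input: 3 image channels + 1 channel holding the tiled angle
        self.coarse = nn.Sequential(
            nn.Conv2d(4, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 2, 3, padding=1),            # coarse flow (dx, dy)
        )
        self.fine = nn.Sequential(
            nn.Conv2d(4 + 2, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 2, 3, padding=1),            # flow residual
        )
        self.intensity = nn.Sequential(
            nn.Conv2d(4 + 2, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 1, 3, padding=1), nn.Sigmoid(),  # gain in (0, 1), scaled below
        )

    def forward(self, img, angle):
        # img: (N, 3, H, W) eye crop in [0, 1]; angle: (N,) redirection in degrees
        n, _, h, w = img.shape
        ang = angle.view(n, 1, 1, 1).expand(n, 1, h, w) / 15.0   # tile + rough normalization
        x = torch.cat([img, ang], dim=1)

        # coarse-to-fine flow prediction
        coarse_in = F.interpolate(x, scale_factor=0.5, mode='bilinear', align_corners=False)
        flow = self.coarse(coarse_in)
        flow = F.interpolate(flow, size=(h, w), mode='bilinear', align_corners=False)
        flow = flow + self.fine(torch.cat([x, flow], dim=1))

        # build a normalized sampling grid and warp the input (differentiable)
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h, device=img.device),
            torch.linspace(-1, 1, w, device=img.device),
            indexing='ij')
        base = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(n, h, w, 2)
        grid = base + flow.permute(0, 2, 3, 1)
        warped = F.grid_sample(img, grid, align_corners=False)

        # per-pixel intensity correction of the warped result
        gain = 2.0 * self.intensity(torch.cat([x, flow], dim=1))
        return (warped * gain).clamp(0.0, 1.0)
```

Because every operation, including the grid sampling, is differentiable, such a pipeline can be trained end-to-end in a feed-forward manner, e.g. with a pixel-wise reconstruction loss against ground-truth redirected images.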
Fig. 1. Gaze redirection with our model trained for vertical gaze redirection. The model takes an input image (middle row) and the desired redirection angle (here varying between −15° and +15°) and re-synthesizes a new image with the new gaze direction. Note the preservation of fine details, including specular highlights, in the resynthesized images.
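A sweep like the one shown in Fig. 1 could be produced by querying a trained model at a range of angles. The snippet below assumes the hypothetical WarpingGazeRedirector class from the earlier sketch and an already-trained checkpoint at an assumed path.

```python
# Hypothetical usage: render a vertical gaze sweep from -15 to +15 degrees,
# as in Fig. 1, with the sketch model defined above (all names assumed).
import torch

model = WarpingGazeRedirector()
model.load_state_dict(torch.load("gaze_redirector.pt"))  # assumed checkpoint path
model.eval()

eye = torch.rand(1, 3, 48, 64)  # stand-in for a normalized eye crop (N, C, H, W)
with torch.no_grad():
    sweep = [model(eye, torch.tensor([float(a)])) for a in range(-15, 16, 5)]
frames = torch.cat(sweep, dim=0)  # (7, 3, 48, 64): one redirected image per angle
```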
Generally, few image parts have as dramatic an effect on the perception of an image as the regions depicting the eyes of a person in that image. Humans (and even non-humans [23]) can infer a lot of information about the owner of the eyes, her intent, her mood, and the world around her from the appearance of the eyes and, in particular, from the direction of the gaze.