Discriminative Region Proposal Adversarial Network for High-Quality Image-to-Image Translation



Chao Wang1 · Wenjie Niu1 · Yufeng Jiang1 · Haiyong Zheng1,2 · Zhibin Yu1 · Zhaorui Gu1 · Bing Zheng1

Received: 28 February 2019 / Accepted: 2 December 2019 © Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract Image-to-image translation has made much progress by embracing Generative Adversarial Networks (GANs). However, translation tasks that require high quality remain very challenging, especially at high resolution and with photo-realism. In this work, we present the Discriminative Region Proposal Adversarial Network (DRPAN) for high-quality image-to-image translation. We decompose the image-to-image translation procedure into three iterated steps: the first generates an image with global structure but some local artifacts (via a GAN); the second uses our Discriminative Region Proposal network (DRPnet) to propose the most fake region of the generated image; and the third performs “image inpainting” on the most fake region through a reviser to yield a more realistic result, so that the system (DRPAN) can be gradually optimized to synthesize images with more attention on the local part containing the most artifacts. We explore a patch-based GAN to construct DRPnet, which proposes the discriminative region used to produce masked fake samples. Further, we propose a reviser for GANs that distinguishes real samples from masked fake samples, providing constructive revisions to the generator for producing realistic details; together they serve as auxiliaries of the generator to synthesize high-quality results. In addition, we combine pix2pixHD with DRPAN to synthesize high-resolution results with much finer details. Moreover, we improve CycleGAN with DRPAN to address unpaired image-to-image translation with better semantic alignment. Experiments on a variety of paired and unpaired image-to-image translation tasks validate that our method outperforms the state of the art in synthesizing high-quality translation results, in terms of both human perceptual studies and automatic quantitative measures. Our code is available at https://github.com/godisboy/DRPAN.

Keywords Image-to-image translation · GAN · Pix2pix · Pix2pixHD · CycleGAN · DRPAN
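To make the second step more concrete, the following is a minimal sketch of how a "most fake" region might be proposed from a patch-based discriminator and turned into a masked fake sample. It is not the released implementation. It assumes that the discriminator emits a 2D map of realness scores in [0, 1] with lower values meaning more fake, that the score map and the image share the same grid resolution, and that a masked fake sample is formed by pasting the proposed fake region into the real image. The function names `propose_most_fake_region` and `make_masked_fake` are hypothetical.

```python
# Sketch only, under the assumptions stated above; all names are hypothetical.

def propose_most_fake_region(score_map, win_h, win_w):
    """Return (top, left, mean_score) of the window with the lowest mean score.

    score_map[i][j] is assumed to be a realness score for one patch,
    where a lower score means the patch looks more fake.
    """
    rows, cols = len(score_map), len(score_map[0])
    if win_h > rows or win_w > cols:
        raise ValueError("window is larger than the score map")
    best = None
    # Slide a win_h x win_w window over the map and keep the lowest mean.
    for top in range(rows - win_h + 1):
        for left in range(cols - win_w + 1):
            total = sum(
                score_map[r][c]
                for r in range(top, top + win_h)
                for c in range(left, left + win_w)
            )
            mean = total / (win_h * win_w)
            if best is None or mean < best[2]:
                best = (top, left, mean)
    return best


def make_masked_fake(real, fake, top, left, win_h, win_w):
    """Copy `real`, then paste the proposed window of `fake` into the copy."""
    masked = [row[:] for row in real]  # leave the real image untouched
    for r in range(top, top + win_h):
        for c in range(left, left + win_w):
            masked[r][c] = fake[r][c]
    return masked


# Toy usage: a 3 x 3 score map whose lower-right corner looks most fake.
score_map = [
    [0.9, 0.8, 0.9],
    [0.7, 0.1, 0.2],
    [0.9, 0.2, 0.1],
]
real = [["R"] * 3 for _ in range(3)]
fake = [["F"] * 3 for _ in range(3)]
top, left, mean = propose_most_fake_region(score_map, 2, 2)
masked_fake = make_masked_fake(real, fake, top, left, 2, 2)
```

In this sketch the reviser would then be trained to tell `real` apart from `masked_fake`, which concentrates its feedback on the proposed region. The exhaustive window search is chosen for readability; a pooled or convolutional formulation would be the natural choice on real score maps.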

Communicated by Jun-Yan Zhu, Hongsheng Li, Eli Shechtman, Ming-Yu Liu, Jan Kautz, Antonio Torralba. Chao Wang, Wenjie Niu, and Yufeng Jiang contributed equally.

1 Introduction


From the aspect of human visual perception, a synthesized image is often considered fake because it contains local artifacts. Although the image looks real at first glance, we can still easily distinguish the fake image from the real one by gazing for only approximately 1000 ms (Chen et al. 2017). Human beings can draw a realistic scene from coarse structure to fine detail, that is, we us