Perceptual Losses for Real-Time Style Transfer and Super-Resolution
Abstract. We consider image transformation problems, where an input image is transformed into an output image. Recent methods for such problems typically train feed-forward convolutional neural networks using a per-pixel loss between the output and ground-truth images. Parallel work has shown that high-quality images can be generated by defining and optimizing perceptual loss functions based on high-level features extracted from pretrained networks. We combine the benefits of both approaches, and propose the use of perceptual loss functions for training feed-forward networks for image transformation tasks. We show results on image style transfer, where a feed-forward network is trained to solve the optimization problem proposed by Gatys et al. in real-time. Compared to the optimization-based method, our network gives similar qualitative results but is three orders of magnitude faster. We also experiment with single-image super-resolution, where replacing a per-pixel loss with a perceptual loss gives visually pleasing results.
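As a concrete illustration of the perceptual loss functions mentioned above, the following is a minimal PyTorch sketch of a feature reconstruction loss built on a pretrained network. The choice of VGG-16 and of the relu3_3 activation are illustrative assumptions here, not specifics stated in this abstract.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16

class FeatureReconstructionLoss(nn.Module):
    # Compares activations of a fixed, pretrained VGG-16 instead of raw
    # pixels. Truncating at features[:16] ends at the relu3_3 activation;
    # both the loss network and the layer choice are assumptions made for
    # this sketch.
    def __init__(self):
        super().__init__()
        features = vgg16(weights="IMAGENET1K_V1").features[:16]
        for p in features.parameters():
            p.requires_grad = False  # the loss network stays frozen
        self.features = features.eval()

    def forward(self, output, target):
        # Mean squared error in feature space; inputs are assumed to be
        # normalized with ImageNet statistics, as VGG-16 expects.
        return F.mse_loss(self.features(output), self.features(target))

During training, gradients flow through the frozen loss network into the image transformation network being trained; only the transformation network's weights are updated.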
Keywords: Style transfer · Super-resolution · Deep learning

1 Introduction
Many classic problems can be framed as image transformation tasks, where a system receives some input image and transforms it into an output image. Examples from image processing include denoising, super-resolution, and colorization, where the input is a degraded image (noisy, low-resolution, or grayscale) and the output is a high-quality color image. Examples from computer vision include semantic segmentation and depth estimation, where the input is a color image and the output image encodes semantic or geometric information about the scene.

One approach for solving image transformation tasks is to train a feed-forward convolutional neural network in a supervised manner, using a per-pixel loss function to measure the difference between output and ground-truth images.
[Fig. 1 image panels. Style transfer (top): Content, Style, Gatys et al. [11], Ours. Super-resolution (bottom): Bicubic, SRCNN [13], Perceptual loss (ours), Ground Truth.]
Fig. 1. Example results for style transfer (top) and ×4 super-resolution (bottom). For style transfer, we achieve results similar to Gatys et al. [11] but are three orders of magnitude faster. For super-resolution, our method trained with a perceptual loss reconstructs fine details better than methods trained with a per-pixel loss.
This approach has been used, for example, by Dong et al. for super-resolution [1], by Cheng et al. for colorization [2,3], by Long et al. for segmentation [4], and by Eigen et al. for depth and surface normal prediction [5,6]. Such approaches are efficient at test time, requiring only a forward pass through the trained network.
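For concreteness, a minimal sketch of this per-pixel supervised setup is given below, assuming a PyTorch setting; the three-layer network, optimizer settings, and random tensors are placeholders for illustration, not the architectures used in the works cited above.

import torch
import torch.nn as nn

# Toy feed-forward transformation network; the architecture is a placeholder.
net = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, kernel_size=3, padding=1),
)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

x = torch.rand(4, 3, 64, 64)  # batch of degraded inputs (dummy data)
y = torch.rand(4, 3, 64, 64)  # corresponding ground-truth images (dummy data)

# One supervised step: the per-pixel loss is a mean squared error computed
# directly between output pixels and ground-truth pixels.
optimizer.zero_grad()
loss = nn.functional.mse_loss(net(x), y)
loss.backward()
optimizer.step()

# At test time only the cheap forward pass net(x) is needed.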