Learning Representations for Automatic Colorization



1 University of Chicago, Chicago, USA
  [email protected]
2 Toyota Technological Institute at Chicago, Chicago, USA
  {mmaire,greg}@ttic.edu

Abstract. We develop a fully automatic image colorization system. Our approach leverages recent advances in deep networks, exploiting both low-level and semantic representations. As many scene elements naturally appear according to multimodal color distributions, we train our model to predict per-pixel color histograms. This intermediate output can be used to automatically generate a color image, or further manipulated prior to image formation. On both fully and partially automatic colorization tasks, we outperform existing methods. We also explore colorization as a vehicle for self-supervised visual representation learning.
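As a toy illustration of the histogram-to-image step mentioned in the abstract (the bin layout and decoding rules here are assumptions for the sketch, not the paper's exact design): a predicted per-pixel histogram over color bins can be decoded either by expectation, which is smooth but averages across modes, or by the mode, which preserves a single vivid color when the distribution is multimodal.

```python
import numpy as np

# Hypothetical decoding of per-pixel color histograms into a single
# hue channel. K and the bin centers are assumed for illustration.
K = 32                                    # number of hue bins (assumed)
centers = (np.arange(K) + 0.5) / K        # bin centers in [0, 1)

hist = np.random.rand(4, 4, K)            # toy per-pixel histograms
hist /= hist.sum(axis=-1, keepdims=True)  # normalize to distributions

hue_expect = (hist * centers).sum(axis=-1)  # expectation: smooth, may desaturate
hue_mode = centers[hist.argmax(axis=-1)]    # mode: one plausible color per pixel

print(hue_expect.shape, hue_mode.shape)
```

The contrast between the two decodings is exactly why the intermediate histogram output is useful: it defers the choice among multiple plausible colors until image formation.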

1 Introduction

Colorization of grayscale images is a simple task for the human imagination. A human need only recall that sky is blue and grass is green; for many objects, the mind is free to hallucinate several plausible colors. The high-level comprehension required for this process is precisely why the development of fully automatic colorization algorithms remains a challenge. Colorization is thus intriguing beyond its immediate practical utility in graphics applications. Automatic colorization serves as a proxy measure for visual understanding. Our work makes this connection explicit; we unify a colorization pipeline with the type of deep neural architectures driving advances in image classification and object detection. Both our technical approach and focus on fully automatic results depart from past work.

Given colorization's importance across multiple applications (e.g. historical photographs and videos [40], artist assistance [31,37]), much research strives to make it cheaper and less time-consuming [3,5–7,13,19,21,26,41]. However, most methods still require some level of user input [3,6,13,19,21,33]. Our work joins the relatively few recent efforts on fully automatic colorization [5,7,26]. Some [5,7] show promising results on typical scenes (e.g. landscapes), but their success is limited on complex images with foreground objects.

At a technical level, existing automatic colorization methods often employ a strategy of finding suitable reference images and transferring their color onto a target grayscale image [7,26]. This works well if sufficiently similar reference

Electronic supplementary material: The online version of this chapter (doi:10.1007/978-3-319-46493-0_35) contains supplementary material, which is available to authorized users.

© Springer International Publishing AG 2016
B. Leibe et al. (Eds.): ECCV 2016, Part IV, LNCS 9908, pp. 577–593, 2016. DOI: 10.1007/978-3-319-46493-0_35


G. Larsson et al.

Fig. 1. Our automatic colorization of grayscale input; more examples in Figs. 3 and 4. (Color figure online)

Fig. 2. System overview. We process a grayscale image through a deep convolutional architecture (VGG) [36] and take spatially localized multilayer slices (hypercolumns) [14,25,27] as per-pixel descriptors. We train our system
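The hypercolumn construction in Fig. 2 can be sketched as follows. This is a minimal illustration under assumed shapes and nearest-neighbor upsampling, not the authors' implementation: each layer's feature map is upsampled to a common resolution and the channels are stacked, so every pixel receives one descriptor that mixes low-level and semantic features.

```python
import numpy as np

def upsample_nn(fmap, out_h, out_w):
    """Nearest-neighbor upsample a (C, h, w) feature map to (C, out_h, out_w)."""
    c, h, w = fmap.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return fmap[:, rows][:, :, cols]

def hypercolumns(feature_maps, out_h, out_w):
    """Upsample each layer's map to a common resolution and concatenate
    along the channel axis, yielding one multilayer descriptor per pixel."""
    ups = [upsample_nn(f, out_h, out_w) for f in feature_maps]
    return np.concatenate(ups, axis=0)  # (sum of channels, out_h, out_w)

# Toy stand-ins for three VGG layers at decreasing spatial resolution
# (channel counts and sizes are assumptions for the sketch).
feats = [np.random.randn(64, 56, 56),
         np.random.randn(128, 28, 28),
         np.random.randn(256, 14, 14)]
desc = hypercolumns(feats, 56, 56)
print(desc.shape)  # (448, 56, 56)
```

The descriptor at any pixel location then serves as the input from which the per-pixel color histogram is predicted.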