Compositional GAN: Learning Image-Conditional Binary Composition



Samaneh Azadi · Deepak Pathak · Sayna Ebrahimi · Trevor Darrell
University of California, Berkeley, USA

Received: 20 April 2019 / Accepted: 30 April 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Communicated by Jun-Yan Zhu, Hongsheng Li, Eli Shechtman, Ming-Yu Liu, Jan Kautz, Antonio Torralba.

Abstract
Generative Adversarial Networks can produce images of remarkable complexity and realism, but they are generally structured to sample from a single latent source, ignoring the explicit spatial interactions between the multiple entities that could be present in a scene. Capturing such complex interactions between different objects in the world, including their relative scaling, spatial layout, occlusion, or viewpoint transformation, is a challenging problem. In this work, we propose a novel self-consistent Composition-by-Decomposition network to compose a pair of objects. Given object images from two distinct distributions, our model can generate a realistic composite image from their joint distribution, following the texture and shape of the input objects. We evaluate our approach through qualitative experiments and user evaluations. Our results indicate that the learned model captures potential interactions between the two object domains and generates realistic composed scenes at test time.

Keywords: Conditional Generative Adversarial Network · Composition · Decomposition
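As background for the conditional GAN setting named in the keywords, and formalized in the works cited in the introduction below (Mirza and Osindero 2014; Isola et al. 2017), the standard image-conditional objective is the minimax game

\min_G \max_D \; \mathbb{E}_{x,y}\big[\log D(x, y)\big] \;+\; \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big],

where x is the conditioning input (an image, text phrase, or class label), y a real sample from the target distribution, z a noise vector, and G and D the generator and discriminator. This is generic cGAN background, not the specific composition objective proposed in this paper.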

1 Introduction

Conditional Generative Adversarial Networks (cGANs) have emerged as a powerful method for generating images conditioned on a given input. The input cue could be in the form of an image (Isola et al. 2017; Zhu et al. 2017a; Liu et al. 2017; Azadi et al. 2017; Wang et al. 2017; Pathak et al. 2016), a text phrase (Zhang et al. 2017; Reed et al. 2016b, a; Johnson et al. 2018), or a class label or layout (Mirza and Osindero 2014; Odena et al. 2016; Antoniou et al. 2017). The goal in most of these GAN instances is to learn a mapping that translates a given sample from the source distribution into a sample from the output distribution. This primarily involves either transforming a single object of interest (apples to oranges, horses to zebras, label to image, etc.) or changing the style and texture of the input image (day to night, etc.). However, these direct transformations do not capture the fact that a natural image is a 2D projection of a composition of multiple objects interacting in a 3D visual world. Here, we explore the role of compositionality in GAN frameworks and propose a new method that learns a function mapping images of different objects, sampled from their marginal distributions (e.g., chair and table), into a combined sample (table–chair) that captures the joint distribution of object pairs. In this paper, we specifically focus on the composition of a pair of objects. Modeli
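To make the composition-by-decomposition self-consistency described in the abstract concrete, the minimal sketch below pairs a composition generator (two object images in, one composite out) with a decomposition generator (composite in, two object images out) and ties them together with a reconstruction loss. All module names, layer sizes, and the plain L1 loss are illustrative assumptions; the paper's actual architecture and full training objective (including its adversarial terms) are not reproduced here.

# Minimal, hypothetical sketch of composition-by-decomposition self-consistency.
# Everything below (network shapes, losses, hyperparameters) is illustrative,
# not the paper's implementation.
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    """Toy encoder-decoder standing in for a full image-to-image generator."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, out_ch, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

# Composition net: concatenated object images (6 channels) -> composite (3 channels).
compose = TinyGenerator(in_ch=6, out_ch=3)
# Decomposition net: composite (3 channels) -> both objects back (6 channels).
decompose = TinyGenerator(in_ch=3, out_ch=6)

opt = torch.optim.Adam(
    list(compose.parameters()) + list(decompose.parameters()), lr=2e-4
)
l1 = nn.L1Loss()

# One illustrative training step on random stand-in images
# (e.g., x_a could be chairs and x_b tables).
x_a = torch.rand(4, 3, 64, 64)
x_b = torch.rand(4, 3, 64, 64)

composite = compose(torch.cat([x_a, x_b], dim=1))
recon = decompose(composite)
recon_a, recon_b = recon[:, :3], recon[:, 3:]

# Self-consistency: decomposing the composite should recover both inputs.
loss = l1(recon_a, x_a) + l1(recon_b, x_b)
opt.zero_grad()
loss.backward()
opt.step()
print(f"self-consistency loss: {loss.item():.4f}")

In a full GAN setup, the composite would additionally be scored by a discriminator so that it resembles a realistic joint scene rather than merely reconstructing its parts.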