Unsupervised Visual Representation Learning by Graph-Based Consistent Constraints

Learning rich visual representations often require training on datasets of millions of manually annotated examples. This substantially limits the scalability of learning effective representations as labeled data is expensive or scarce. In this paper, we a

PDF / 6,837,801 Bytes
17 Pages / 439.37 x 666.142 pts Page_size
66 Downloads / 213 Views

DOWNLOAD

REPORT

Tsinghua University, Beijing, China [email protected] 2 University of California, Merced, Merced, USA 3 University of Illinois, Urbana-Champaign, Champaign, USA https://sites.google.com/site/lidonggg930/feature-learning

Abstract. Learning rich visual representations often require training on datasets of millions of manually annotated examples. This substantially limits the scalability of learning eﬀective representations as labeled data is expensive or scarce. In this paper, we address the problem of unsupervised visual representation learning from a large, unlabeled collection of images. By representing each image as a node and each nearest-neighbor matching pair as an edge, our key idea is to leverage graph-based analysis to discover positive and negative image pairs (i.e., pairs belonging to the same and diﬀerent visual categories). Speciﬁcally, we propose to use a cycle consistency criterion for mining positive pairs and geodesic distance in the graph for hard negative mining. We show that the mined positive and negative image pairs can provide accurate supervisory signals for learning eﬀective representations using Convolutional Neural Networks (CNNs). We demonstrate the eﬀectiveness of the proposed unsupervised constraint mining method in two settings: (1) unsupervised feature learning and (2) semi-supervised learning. For unsupervised feature learning, we obtain competitive performance with several state-of-the-art approaches on the PASCAL VOC 2007 dataset. For semisupervised learning, we show boosted performance by incorporating the mined constraints on three image classiﬁcation datasets. Keywords: Unsupervised feature learning · Semi-supervised learning Image classiﬁcation · Convolutional neural networks

1

·

Introduction

Convolutional neural networks have recently achieved impressive performance on a broad range of visual recognition tasks [1–3]. However, the success of CNNs Electronic supplementary material The online version of this chapter (doi:10. 1007/978-3-319-46493-0 41) contains supplementary material, which is available to authorized users. c Springer International Publishing AG 2016 B. Leibe et al. (Eds.): ECCV 2016, Part IV, LNCS 9908, pp. 678–694, 2016. DOI: 10.1007/978-3-319-46493-0 41

Unsupervised Visual Representation Learning

679

Large variations

(a) Direct matching

(b) Cyclic matching

Fig. 1. Illustration of positive mining based on cycle consistency. (a) Direct image matching using similarity of the appearance features often results in matching pairs with very similar appearances (e.g., certain pose of cars). (b) By ﬁnding cycles in the graph, we observe that image pairs in the cycle are likely to belong to the same visual category but with large appearance variations (e.g., under diﬀerent viewpoints).

is mainly attributed to supervised learning over massive amounts of humanlabeled data. The need of large-scale manual annotations substantially limits the scalability of learning eﬀective representations as labeled data is expensive or scarce. In this paper, we address th

Data Loading...

Unsupervised Visual Representation Learning by Graph-Based Consistent Constraints

Recommend Documents

Unsupervised Visual Time-Series Representation Learning and Clustering

Unsupervised representation learning with Minimax distance measures

Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles

Visual Representation

Unsupervised Deep Representation Learning for Real-Time Tracking

Cross-Graph Representation Learning for Unsupervised Graph Alignment

Learning an Unsupervised and Interpretable Representation of Emotion from Speech

Visual Memorability for Robotic Interestingness via Unsupervised Online Learning

Unsupervised Learning

Unsupervised Learning

Representation Learning on Visual-Symbolic Graphs for Video Understanding

Learning Visual Context by Comparison