Adversarial Training for Sketch Retrieval



Abstract. Generative Adversarial Networks (GANs) are able to learn excellent representations for unlabelled data which can be applied to image generation and scene classification. Representations learned by GANs have not yet been applied to retrieval. In this paper, we show that the representations learned by GANs can indeed be used for retrieval. We consider heritage documents that contain unlabelled Merchant Marks, sketch-like symbols that are similar to hieroglyphs. We introduce a novel GAN architecture with design features that make it suitable for sketch retrieval. The performance of this sketch-GAN is compared to a modified version of the original GAN architecture with respect to simple invariance properties. Experiments suggest that sketch-GANs learn representations that are suitable for retrieval and which also have increased stability to rotation, scale and translation compared to the standard GAN architecture.

Keywords: Deep learning · CNN · GAN · Generative models · Sketches
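For context (this derivation is not part of the original abstract), the representations in question arise from the standard GAN minimax game of Goodfellow et al., in which a generator G and a discriminator D are trained against one another on unlabelled data:

```latex
\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
```

Because neither network ever sees a label, both learn features of the data distribution purely from examples, which is what makes the approach attractive for an unlabelled collection such as the Merchant Marks.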

1 Introduction

Recently, the UK’s National Archives has collected over 70,000 heritage documents that originate between the 16th and 19th centuries. These documents make up a small part of the “Prize Papers”, which are of great historical importance, as they were used to establish the legitimacy of ship captures at sea. This collection of documents contains Merchant Marks (see Fig. 4B), symbols used to uniquely identify the property of a merchant. For further historical research to be conducted, the organisation requires that the dataset be searchable by visual example (see Fig. 1). These marks are sparse line drawings, which makes it challenging to search for visually similar Merchant Marks between documents. This dataset poses the following challenges to learning representations that are suitable for visual search:

1. Merchant Marks are line drawings, absent of both texture and colour, which means that marks cannot be distinguished based on these properties.

2. Many machine learning techniques, most notably convolutional neural networks (CNNs), require large amounts of labelled training data, on the order of millions of labelled images [7]. None of the Merchant Marks are labelled, and in many cases it is not clear what labels would be assigned to them. This motivates an unsupervised approach to learning features.

© Springer International Publishing Switzerland 2016
G. Hua and H. Jégou (Eds.): ECCV 2016 Workshops, Part I, LNCS 9913, pp. 798–809, 2016. DOI: 10.1007/978-3-319-46604-0_55
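The details of the sketch-GAN are not given in this excerpt, but the retrieval step it motivates can be sketched in a few lines. Assume (hypothetically) that each mark has already been encoded into a feature vector, e.g. by taking the intermediate activations of a trained GAN discriminator; searching by visual example then reduces to nearest-neighbour lookup over those vectors. The function name and toy data below are illustrative, not from the paper:

```python
import numpy as np

def cosine_retrieve(query_feat, database_feats, k=3):
    """Return indices of the k database entries most similar to the query,
    ranked by cosine similarity between feature vectors."""
    q = query_feat / (np.linalg.norm(query_feat) + 1e-8)
    db = database_feats / (np.linalg.norm(database_feats, axis=1, keepdims=True) + 1e-8)
    sims = db @ q                      # cosine similarity of each entry to the query
    return np.argsort(-sims)[:k]       # indices sorted by decreasing similarity

# Toy example: four 2-D "features"; the query points roughly at entry 2.
db = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]])
query = np.array([0.6, 0.8])
print(cosine_retrieve(query, db, k=2))  # prints [2 1]
```

In practice the feature vectors would be high-dimensional and the search backed by an approximate-nearest-neighbour index, but the ranking principle is the same.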


Fig. 1. An overview of the problem: the circled items contain examples of Merchant Marks; note that although some marks are distinct, they are still visually similar. We would like to retrieve visually similar examples, and find exact matches if they exist. Note that the two marks on the left are exact matches, while the others might be considered to be visually similar.

Fig. 2. Most marks in the Merchant Marks dataset are made of the above substructures, which we refer to as parts.

3. Th