Network Flow Formulations for Learning Binary Hashing

The problem of learning binary hashing seeks the identification of a binary mapping for a set of n examples such that the corresponding Hamming distances preserve high fidelity with a given \(n \times n\) matrix of distances (or affinities). This formulat

  • PDF / 1,477,712 Bytes
  • 16 Pages / 439.37 x 666.142 pts Page_size
  • 25 Downloads / 222 Views

DOWNLOAD

REPORT


University of Wisconsin–Whitewater, Whitewater, USA {mukherjl,sigmundTJ27}@uww.edu 2 University of Houston, Houston, USA [email protected] 3 University of Wisconsin–Madison, Madison, USA [email protected]

Abstract. The problem of learning binary hashing seeks the identification of a binary mapping for a set of n examples such that the corresponding Hamming distances preserve high fidelity with a given n×n matrix of distances (or affinities). This formulation has numerous applications in efficient search and retrieval of images (and other high dimensional data) on devices with storage/processing constraints. As a result, the problem has received much attention recently in vision and machine learning and a number of interesting solutions have been proposed. A common feature of most existing solutions is that they adopt continuous iterative optimization schemes which is then followed by a post-hoc rounding process to recover a feasible discrete solution. In this paper, we present a fully combinatorial network-flow based formulation for a relaxed version of this problem. The main maximum flow/minimum cut modules which drive our algorithm can be solved efficiently and can directly learn the binary codes. Despite its simplicity, we show that on most widely used benchmarks, our proposal yields competitive performance relative to a suite of nine different state of the art algorithms.

1

Introduction

The need to perform quick indexing and retrieval on large collections of images and videos, both in our personal collections and on the web, has led to a resurgence of interest in efficient methods for hashing. While the literature on this topic is quite mature, the need to accomplish retrieval tasks on novel mobile architectures has led to numerous interesting variations of the problem. For example, small form factor devices with limited storage capacity as well as the associated power consumption constraints typically involve unique economy/accuracy trade-offs for the algorithm. Separately, when the data correspond to images, notice that it may not be sensible to hash the native image representation expressed as a RD vector. Indeed, much of the image may include background clutter as well as other content not directly pertinent to the semantic information of the image. This leads to interesting variations of hashing c Springer International Publishing AG 2016  B. Leibe et al. (Eds.): ECCV 2016, Part V, LNCS 9909, pp. 366–381, 2016. DOI: 10.1007/978-3-319-46454-1 23

Network Flow Formulations for Learning Binary Hashing

367

where each to-be-hashed example may actually correspond to multiple patches (or salient objects) contained within an image. In vision applications of hashing, it is often the case that rather than represent the image in terms of its components as described above, it is more convenient to compute similarities between pairs of images in the dataset [1,2]. In practice, this is accomplished by running an object detector on each image and assessing whether each pair of images contain similar content. For inst