Binary Hashing with Semidefinite Relaxation and Augmented Lagrangian

Abstract. This paper proposes two approaches for inferring binary codes in two-step (supervised and unsupervised) hashing. We first introduce a unified formulation for both supervised and unsupervised hashing. We then cast the learning of one bit as a Binary Quadratic Problem (BQP) and propose two approaches to solve it. In the first approach, we relax the BQP into a semidefinite programming (SDP) problem whose global optimum can be achieved. We theoretically prove that the objective value of the binary solution obtained by this approach is well bounded. In the second approach, we propose an augmented Lagrangian based approach that solves the BQP directly, without relaxing the binary constraint. Experimental results on three benchmark datasets show that our proposed methods compare favorably with the state of the art.
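For orientation, the per-bit problem and its semidefinite relaxation can be sketched in standard form; the notation used here (a cost matrix A and a bit vector x collecting one bit over the n training samples) is generic and may differ from the formulation developed in the paper.

\[
\text{(BQP)} \qquad \min_{x} \; x^{\top} A x \qquad \text{s.t.} \quad x \in \{-1, 1\}^{n}.
\]

Writing X = x x^{\top}, the binary constraint is equivalent to \operatorname{diag}(X) = \mathbf{1}, X \succeq 0, \operatorname{rank}(X) = 1; dropping the non-convex rank constraint yields the convex semidefinite relaxation

\[
\text{(SDP)} \qquad \min_{X \succeq 0} \; \operatorname{tr}(A X) \qquad \text{s.t.} \quad X_{ii} = 1, \; i = 1, \dots, n,
\]

whose optimal value lower-bounds that of the BQP. A binary vector can then be recovered from the relaxed X, for instance by randomized rounding in the spirit of Goemans and Williamson, which is one standard way to obtain a binary solution with a bounded objective value of the kind referred to above.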

Keywords: Two-step hashing · Semidefinite programming · Augmented Lagrangian

1 Introduction

Hashing methods construct a set of hash functions that map the original high-dimensional data into low-dimensional binary codes. The resulting binary vectors allow efficient storage and fast searching, making hashing an attractive approach for large-scale visual search [1,2].

Existing hashing methods can be categorized into data-independent and data-dependent schemes. Data-independent hashing methods [3–6] rely on random projections to construct hash functions. Data-dependent hashing methods use available training data to learn hash functions in an unsupervised or supervised way. Unsupervised hashing methods, e.g. Spectral Hashing [7], Iterative Quantization (ITQ) [8], K-means Hashing [9], Spherical Hashing [10], Nonnegative Matrix Factorization (NMF) hashing [11], try to preserve the distance similarity of samples. Supervised hashing methods, e.g. Minimal Loss Hashing [12], ITQ-CCA [8], Binary Reconstructive Embedding [13], KSH [14], Two-Step Hashing [15], FastHash [16], try to preserve the label similarity of samples.

Most of the aforementioned hashing methods follow two general steps for computing binary codes. The first step is to define hash functions together with a specific loss function. Usually, the hash functions take a linear form [3,8,12] or a nonlinear (e.g. kernel) form [4,13,14]. The loss functions are typically defined by minimizing the difference between the Hamming affinity (or distance) of data pairs and the ground truth [13–16], as sketched below. The second step is to solve for the hash function parameters by minimizing the loss function under the binary constraint on the codes. The coupling of the hash function and the binary constraint often results in a highly non-convex optimization problem which is very challenging to solve. Furthermore, because the hash functions vary for different methods, different optimization techniques are needed for each of them.
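As a minimal illustration of such a pairwise loss (generic notation, not the exact objective of any particular method cited above): let b_i \in \{-1,1\}^{L} denote the L-bit code of sample i and let S_{ij} be the target pairwise similarity, e.g. \pm 1 label agreement in the supervised case or a neighborhood indicator in the unsupervised case. A typical affinity-fitting loss is

\[
\min_{B \in \{-1,1\}^{n \times L}} \; \sum_{i,j} \Big( \tfrac{1}{L}\, b_i^{\top} b_j - S_{ij} \Big)^{2},
\]

where \tfrac{1}{L} b_i^{\top} b_j is an affine rescaling of the Hamming distance between b_i and b_j. Minimizing such a loss under the binary constraint on B, one bit at a time, is the kind of problem that leads to the Binary Quadratic Problem considered in this paper.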

1.1 Related Work

Our work is inspired by a few recent supervised ha