Hash Function Learning via Codewords
1 Department of Electrical Engineering and Computer Science, University of Central Florida, 4000 Central Florida Blvd, Orlando, FL 32816, USA
[email protected], [email protected]
2 Department of Electrical and Computer Engineering, Florida Institute of Technology, 150 W University Blvd, Melbourne, FL 32901, USA
[email protected]

Abstract. In this paper we introduce a novel hash learning framework that has two main distinguishing features, when compared to past approaches. First, it utilizes codewords in the Hamming space as ancillary means to accomplish its hash learning task. These codewords, which are inferred from the data, attempt to capture similarity aspects of the data's hash codes. Secondly, and more importantly, the same framework is capable of addressing supervised, unsupervised and, even, semi-supervised hash learning tasks in a natural manner. A series of comparative experiments focused on content-based image retrieval highlights its performance advantages.

Keywords: Hash function learning · Codeword · Support vector machine

1 Introduction
With the explosive growth of web data, including documents, images and videos, content-based image retrieval (CBIR) has attracted plenty of attention over the past years [1]. Given a query sample, a typical CBIR scheme retrieves samples stored in a database that are most similar to the query sample. The similarity is gauged in terms of a pre-specified distance metric, and the retrieved samples are the nearest neighbors of the query point w.r.t. this metric. However, exhaustively comparing the query sample with every other sample in the database may be computationally expensive in many current practical settings. Additionally, most CBIR approaches may be hindered by the sheer size of each sample; for example, visual descriptors of an image or a video may number in the thousands. Furthermore, storage of these high-dimensional data also presents a challenge. Considerable effort has been invested in designing hash functions that transform the original data into compact binary codes, so as to reap the benefits of a potentially fast similarity search; note that hash functions are typically designed to preserve certain similarity qualities between the data. For example, approximate nearest neighbor (ANN) search [2] using compact binary codes in Hamming space was shown to achieve sub-linear search time. Storage of the binary codes is, obviously, also much more efficient.

© Springer International Publishing Switzerland 2015. A. Appice et al. (Eds.): ECML PKDD 2015, Part I, LNAI 9284, pp. 659–674, 2015. DOI: 10.1007/978-3-319-23528-8_41
Existing hashing methods can be divided into two categories: data-independent and data-dependent. The former category does not use a data-driven approach to choose the hash function. For example, Locality Sensitive Hashing (LSH) [3] randomly projects and thresholds data into the Hamming space to generate binary codes, so that closely located samples (in terms of Euclidean distances in the data's native space) are likely to have similar codes.
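The random-projection scheme behind LSH can be sketched as follows (a minimal illustration, not the paper's method; parameter names are ours). Each output bit records on which side of a random hyperplane a sample falls, so nearby samples agree on most bits:

```python
import numpy as np

# Hyperplane-based LSH sketch: one random hyperplane per output bit.
def make_lsh_hasher(d, n_bits, seed=0):
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_bits, d))  # rows = hyperplane normals
    def hasher(x):
        # Project onto every hyperplane and threshold at zero.
        return (planes @ x >= 0).astype(np.uint8)
    return hasher

hasher = make_lsh_hasher(d=8, n_bits=16)
a = np.ones(8)
b = np.ones(8) * 1.01   # a near neighbor of a
c = -np.ones(8)         # far from a (its antipode)
ham = lambda u, v: int(np.sum(u != v))
# b collides with a on every bit, while c flips all 16 of them.
```

Being data-independent, the hyperplanes are drawn without looking at the data, which is precisely the property that data-dependent methods, including the framework of this paper, seek to improve upon.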