Semi-supervised discrete hashing for efficient cross-modal retrieval
Xingzhi Wang1 · Xin Liu1 · Ji-Xiang Du2 · Shu-Juan Peng2 · Bineng Zhong1 · Yewang Chen2
Received: 21 August 2019 / Revised: 6 May 2020 / Accepted: 8 June 2020 / © Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract Cross-modal hashing has recently gained significant popularity for facilitating multimedia retrieval across different modalities. Because acquiring large-scale labeled training data is very labor intensive, most supervised cross-modal hashing methods are uncompetitive in real applications. With only limited labels available, this paper presents a novel Semi-Supervised Discrete Hashing (SSDH) method for efficient cross-modal retrieval. In contrast to most semi-supervised cross-modal hashing works, which need to predict the labels of unlabeled data, the proposed approach groups the labeled and unlabeled data together and exploits the informative unlabeled data to promote hash code learning directly. Specifically, SSDH utilizes relaxed hash representations to characterize each modality and learns a semi-supervised semantic-preserving regularization to preserve semantic consistency between the heterogeneous modalities. Accordingly, an efficient objective function is proposed to learn the hash representation, together with an efficient optimization algorithm that optimizes the hash codes for both labeled and unlabeled data. Without sacrificing retrieval performance, the proposed SSDH method adapts to various kinds of retrieval tasks, i.e., unsupervised, semi-supervised and supervised. Experimental comparisons with several competitive algorithms show the effectiveness of the proposed method and its superiority over state-of-the-art methods.
Keywords Semi-supervised discrete hashing · Cross-modal retrieval · Relaxed hash representation · Semantic consistency
Xin Liu
[email protected]
1 Department of Computer Science and Technology, Huaqiao University, Jimei Road, No. 668, Jimei District, Xiamen, Fujian, China
2 Fujian Key Laboratory of Big Data Intelligence and Security, Jimei Road, No. 668, Jimei District, Xiamen, Fujian, China
Multimedia Tools and Applications
1 Introduction
With the tremendous explosion of multimedia data such as images and text on the Internet, recent years have witnessed the great success of cross-modal retrieval techniques for similarity search across different modalities. More specifically, a user can issue a query in one modality (e.g., images) to retrieve relevant items in another modality (e.g., texts). The search results of cross-modal retrieval often contain rich semantic information from different modalities and are therefore more comprehensive than the results of their single-modal counterparts. Since the distributions and representations of different modalities are generally inconsistent, mining the correlation across these heterogeneous modalities becomes an essential issue that needs to be addressed in many