A presentation and retrieval hash scheme of images based on principal component analysis
- PDF / 5,224,366 Bytes
- 14 Pages / 595.276 x 790.866 pts Page_size
- 11 Downloads / 181 Views
ORIGINAL ARTICLE
A presentation and retrieval hash scheme of images based on principal component analysis Chunyan Shuai1
· Xu Wang1 · Min He1 · Xin Ouyang2 · Jun Yang1
Accepted: 2 September 2020 © Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract Image representation and approximate query is always a research challenge and is affected greatly by the dimension and size of images. Since hash-based methods and binary encodings in combination with other techniques, such as kernel tricks, a longer binary code and mapping vectors rotation, can maintain a linear query time and query accuracy, they have been used in this area broadly. This paper develops principal component analysis hashing (PCAH) and unequal length of binary coding to divide images into more categories, denoted as PCA-MD, to improve accuracy of the representation and lookup of images. This paper firstly proves that the eigenvector mapping is locality sensitive, which is the basis for more classes division. For the anisotropy of the eigenvectors, PCA-MD utilizes an unequal length of binary coding and fewer eigenvectors, rather than an equal code, to divide the images mapped on every eigenvector to more categories. Moreover, L1-norm distance is applied to measure the distances of images to avoid the enormous computation of Euclidean distance. Theoretical analysis and extensive experimental results demonstrate that the PCA-MD has a higher query performance and a slight longer run time than the state-of-the-art approaches based on the Hamming distance. This in turn verifies that PCAH is a locality sensitive hash and that partitioning into more categories rather than only two categories is reasonable. Keywords Eigenvector · High-dimensional vectors · K-nearest neighbors · Locality sensitive hash · Multi-dividing · Principal component analysis hashing
1 Introduction Representation and approximate nearest neighbors search of images is a fundamental operation underlying many computer vision approaches. Given a dataset X {x1 , . . . , x N }
B
Xin Ouyang [email protected] Chunyan Shuai [email protected] Xu Wang [email protected] Min He [email protected] Jun Yang [email protected]
1
Faculty of Transportation Engineering, Kunming University of Science and Technology, Kunming, China
2
Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
with d dimensions and a query q, the goal of the representation and nearest neighbors search is to encode every element of the dataset and find the closest points within a certain distance or the first K-nearest neighbors (KNN) with minimum distance from the query, namely arg min xi − q p [1, 2]. Due to O(d N ) time costs, brute force search becomes prohibitively expensive as the dataset size and dimensions grow. To reduce search complexity, a number of data encoding structures and algorithms have been proposed. Li et al. [3] conducts a comprehensive experimental evaluation of 16 state-of-the-art methods for approximate nearest neighbor search, inclu
Data Loading...