Light-Weight Spatial Distribution Embedding of Adjacent Features for Image Search
1 Institute of Information Science, Beijing Jiaotong University, Beijing 100044, China
2 Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing 100044, China
3 Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University, Yichang 443002, Hubei, China
[email protected]
Abstract. Binary code embedding methods can effectively compensate for the quantization error of the bag-of-words (BoW) model and remarkably improve image search performance. However, existing embedding schemes commonly generate binary codes by projecting a local feature from the original feature space into a compact binary space; the spatial relationship between the local feature and its neighbors is ignored. In this paper, we propose two light-weight binary code embedding schemes, named content similarity embedding (CSE) and scale similarity embedding (SSE), to better balance image search performance and resource cost. Specifically, the spatial distribution information of each local feature and its nearest neighbors is encoded into only a few bits, which are used to verify the asserted matches of local features. The experimental results show that the proposed image search scheme achieves a better balance between image search performance and resource usage (i.e., time cost and memory usage). Keywords: Image search · Product quantization · Embedding · BoW
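The paper's CSE and SSE constructions are defined in later sections; as a rough, non-authoritative sketch of the underlying idea, the Python fragment below packs a coarse summary of a keypoint's nearest neighbors into a few bits and uses the Hamming distance between such signatures to verify an asserted match. The function names, the one-bit-per-neighbor scale test, and the threshold are illustrative assumptions, not the paper's actual encoding.

# A minimal sketch of the general idea (NOT the paper's exact CSE/SSE
# construction): summarize the relationship between a keypoint and its
# k nearest neighbors in a few bits, then verify an asserted BoW match
# by Hamming distance. All names and thresholds are illustrative.
import numpy as np

def neighbor_signature(points, scales, idx, k=4):
    # points: (N, 2) keypoint coordinates; scales: (N,) keypoint scales.
    # Find the k nearest neighbors of keypoint `idx`.
    dist = np.linalg.norm(points - points[idx], axis=1)
    dist[idx] = np.inf                      # exclude the keypoint itself
    neighbors = np.argsort(dist)[:k]
    # One bit per neighbor: does the neighbor live at a larger scale?
    sig = 0
    for b, j in enumerate(neighbors):
        sig |= int(scales[j] > scales[idx]) << b
    return sig                              # an int holding k bits

def verify_match(sig_a, sig_b, max_hamming=1):
    # Accept a match asserted by visual-word quantization only if the
    # two few-bit signatures (mostly) agree.
    return bin(sig_a ^ sig_b).count("1") <= max_hamming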
1 Introduction
Content-based image search is a core technique for many real-world visual applications, such as frame-fusion-based video copy detection [1], logo detection [2], and visual content recognition [3]. However, image search remains challenging due to the gap in semantic understanding between humans and computers, as well as appearance variations in scale, orientation, illumination, etc. [4]. Owing to the robustness and effectiveness of local visual features, image search frameworks built on them are commonly employed in both research and industry. Local features such as SIFT [5] and SURF [6] were originally proposed for image matching; they are largely invariant to image scale and rotation and provide robust matching across a substantial range of affine distortion.
Nevertheless, matching schemes that compare two images directly through the similarity of their local feature sets incur a large cost in both computation and storage. To facilitate image search on large-scale image datasets, a pioneering scheme, the Bag-of-Words (BoW) model [7], was proposed to significantly simplify the matching process. The key idea of BoW is to quantize each local feature into one or several so-called visual words and to represent each image as a collection of orderless visual words. Once local features are mapped into visual words, many mature techniques from text retrieval can be applied directly to image search.
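As a concrete illustration of this quantization step, the following Python sketch assigns local descriptors to their nearest visual words and builds the orderless BoW histogram. It assumes a codebook already trained offline (e.g., by k-means over a sample of descriptors); the helper names `quantize` and `bow_histogram` are illustrative, not from the paper.

# A minimal sketch of BoW quantization, assuming a codebook of K visual
# words trained in advance.
import numpy as np

def quantize(descriptors, codebook):
    # descriptors: (N, D) local features; codebook: (K, D) visual words.
    # Assign each descriptor the id of its nearest visual word.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)                # (N,) word ids

def bow_histogram(descriptors, codebook):
    # Represent an image as an orderless histogram over the vocabulary,
    # ready for inverted-file indexing as in text retrieval.
    words = quantize(descriptors, codebook)
    return np.bincount(words, minlength=len(codebook))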