Incremental hashing with sample selection using dominant sets


ORIGINAL ARTICLE

Wing W. Y. Ng¹ · Xiaoxia Jiang¹ · Xing Tian¹ · Marcello Pelillo²,³ · Hui Wang⁴ · Sam Kwong⁵

Received: 22 November 2019 / Accepted: 4 June 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract In the world of big data, large numbers of images are available in social media, corporate and even personal collections. A collection may grow quickly as new images are generated at high rates. The new images may change the distribution of existing classes or introduce new classes, so the collection is dynamic and exhibits concept drift. For efficient image retrieval from an image collection using a query, a hash table consisting of a set of hash functions is needed to transform images into binary hash codes, which are then used to find images similar to the query. If the image collection is dynamic, the hash table built at one time step may not work well at the next, because the collection changes as new images are added. Therefore, the hash table needs to be rebuilt or updated at successive time steps. Incremental hashing (ICH) is the first effective method to deal with the concept drift problem in image retrieval from dynamic collections. In ICH, a new hash table is learned from newly emerging images only, which represent the data distribution of the current data environment. The new hash table is used to generate hash codes for all images, both old and new. Because the collection is dynamic, new images of one class may not be similar to old images of the same class. To learn a new hash table that preserves within-class similarity in both old and new images, incremental hashing with sample selection using dominant sets (ICHDS) is proposed in this paper, which selects representative samples from each class for training the new hash table. Experimental results show that ICHDS yields better retrieval performance than existing dynamic and static hashing methods.

Keywords Image retrieval · Incremental hashing · Semi-supervised hashing · Concept drift · Dominant sets
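As an illustration of the retrieval pipeline summarized in the abstract, the following minimal sketch encodes feature vectors into binary codes, ranks stored images by Hamming distance to a query, and re-encodes every image when the hash table is replaced at a new time step. It is not the ICH or ICHDS learning procedure: the hash functions here are random hyperplanes used as a stand-in for a learned hash table, and all names (`make_hash_table`, `encode`, `retrieve`), the toy feature vectors, and the code length of 8 bits are assumptions made for the example.

```python
import random


def make_hash_table(dim, n_bits, seed):
    """Hypothetical stand-in for a learned hash table: n_bits random hyperplanes."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def encode(table, x):
    """Map a feature vector to a binary code, one bit per hash function."""
    return tuple(
        1 if sum(w * v for w, v in zip(h, x)) >= 0 else 0
        for h in table
    )


def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return sum(p != q for p, q in zip(a, b))


def retrieve(table, codes, query, k):
    """Indices of the k stored codes closest to the query in Hamming distance."""
    q = encode(table, query)
    return sorted(range(len(codes)), key=lambda i: hamming(codes[i], q))[:k]


# Time step 1: build a table and encode the existing collection (toy features).
old_images = [[1.0, 0.2, 0.0], [0.9, 0.1, 0.1], [-1.0, 0.3, 0.2]]
table = make_hash_table(dim=3, n_bits=8, seed=0)
codes = [encode(table, x) for x in old_images]

# Time step 2: new images arrive. The table is replaced (here simply re-drawn
# with another seed; a real method would learn it from data) and ALL images,
# old and new, are re-encoded with the new table.
new_images = [[0.0, -1.0, 0.5]]
all_images = old_images + new_images
table = make_hash_table(dim=3, n_bits=8, seed=1)
codes = [encode(table, x) for x in all_images]

query = [0.0, -1.0, 0.5]
top = retrieve(table, codes, query, k=1)
```

Because the query above equals one of the stored vectors, the top-ranked result has Hamming distance zero to the query code. Codes from the old and the new table are never compared with each other, which is why the whole collection is re-encoded after each update.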

* Xing Tian [email protected]

¹ Guangdong Provincial Key Laboratory of Computational Intelligence and Cyberspace Information, School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China
² Department of Environmental Sciences, Informatics and Statistics, University of Venice, 30172 Venice, Italy
³ European Centre for Living Technology, University of Venice, 30172 Venice, Italy
⁴ School of Computing, Ulster University, Jordanstown, UK
⁵ Department of Computer Science, City University of Hong Kong, Hong Kong, China

1 Introduction

With the rapid development of digital technologies, multimedia data such as video, images and audio are generated in large quantities and at high rates. Similarity search is thus becoming more and more important. How to quickly find the most relevant data from large collections of multimedia data is a