A two-stage hybrid probabilistic topic model for refining image annotation


ORIGINAL ARTICLE

A two-stage hybrid probabilistic topic model for refining image annotation

Dongping Tian1 · Zhongzhi Shi2

Received: 6 October 2018 / Accepted: 10 July 2019
© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Abstract

Refining image annotation has become one of the core research topics in computer vision and pattern recognition because of its great potential in image retrieval. However, the field is still in its infancy, and current methods are not sophisticated enough to extract accurate semantic concepts from low-level image features alone. In this paper, we propose a two-stage hybrid probabilistic topic model to improve the quality of automatic image annotation. First, a probabilistic latent semantic analysis (PLSA) model with asymmetric modalities is learned to estimate the posterior probability of each annotation keyword, which establishes the image-to-word relation. Next, a label similarity graph is constructed from a weighted linear combination of the label similarity and the visual similarity of the images associated with the corresponding labels. In this way, information from low-level visual features and high-level semantic concepts is integrated by taking both the word-to-word and the image-to-image relations into account. Finally, a rank-two relaxation heuristic is used to further mine the correlations among the candidate annotations and obtain the refined results, which play a critical role in semantic-based image retrieval. Extensive experiments show that the proposed model achieves not only superior annotation accuracy but also better retrieval performance.

Keywords  Refining image annotation · Semantic gap · Expectation–maximization · PLSA · Max-bisection · Image retrieval
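As a rough illustration of the second stage summarized in the abstract, the sketch below blends a label-similarity matrix and a visual-similarity matrix into a single matrix of graph edge weights by a weighted linear combination. The function name `combine_similarities`, the mixing weight `alpha`, the choice to zero the diagonal, and the toy matrices are all assumptions made for illustration; the abstract does not state the weighting or the similarity measures.

```python
def combine_similarities(label_sim, visual_sim, alpha=0.5):
    """Blend two square label-by-label similarity matrices into edge weights.

    alpha is a hypothetical mixing weight in [0, 1]; the source text
    does not give its value.
    """
    n = len(label_sim)
    if len(visual_sim) != n:
        raise ValueError("similarity matrices must have the same size")
    if any(len(row) != n for row in label_sim) or any(len(row) != n for row in visual_sim):
        raise ValueError("similarity matrices must be square")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    weights = []
    for i in range(n):
        row = []
        for j in range(n):
            if i == j:
                # Assumption: no self-loops in the label graph.
                row.append(0.0)
            else:
                row.append(alpha * label_sim[i][j] + (1.0 - alpha) * visual_sim[i][j])
        weights.append(row)
    return weights


# Toy similarities for three labels (made-up numbers, not from the paper).
label_sim = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
visual_sim = [[1.0, 0.6, 0.3], [0.6, 1.0, 0.4], [0.3, 0.4, 1.0]]
W = combine_similarities(label_sim, visual_sim, alpha=0.5)
```

Setting `alpha` closer to 1 makes the graph rely more on word-to-word similarity, while values closer to 0 favor image-to-image similarity.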

* Dongping Tian [email protected]; [email protected]

1 Institute of Computer Software, Baoji University of Arts and Sciences, Baoji 721007, Shaanxi, People’s Republic of China

2 Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, 100190 Beijing, People’s Republic of China

1 Introduction

With the rapid development of multimedia information technology, image retrieval has become increasingly important on the Internet and other multimedia platforms. However, in content-based image retrieval (CBIR), it is well known that the semantic gap between low-level visual features and high-level semantic information seriously degrades retrieval performance [14]. Automatic image annotation (AIA) is a promising way to narrow this semantic gap and has received extensive attention in the multimedia research community [1]. Its goal is to find suitable annotation words to represent the visual content of an untagged or noisily tagged image. Over the past years, many methods have been developed for AIA, and most of them can be roughly classified into two categories: classification-based methods and probabilistic modeling methods. To be specific, the classificat