A survey on automatic image annotation


Yilu Chen1 · Xiaojun Zeng1 · Xing Chen1 · Wenzhong Guo1

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Automatic image annotation is a crucial area of computer vision that plays a significant role in image retrieval, image description, and related tasks. With the development of Internet technology, vast numbers of images are posted on the web, which makes it impractical to annotate images by human effort alone. Hence, many computer vision researchers are interested in automatic image annotation and have put great effort into improving its performance. Automatic image annotation is the task of assigning several tags from a limited vocabulary to describe an image. Many algorithms have been proposed to tackle this problem, and they achieve strong performance. In this paper, we review seven algorithms for automatic image annotation and evaluate them using different image features, such as color histograms and the Gist descriptor. Our goal is to provide insights into automatic image annotation. Comprehensive experiments on the Corel5K, IAPR TC-12, and ESP Game datasets are designed to compare the performance of these algorithms. We also compare the performance of traditional algorithms when they employ deep learning features. Considering that human annotators do not annotate all associated labels, we use the DIA metrics on the IAPR TC-12 and ESP Game datasets.

Keywords Computer vision · Image annotation · Tag assignment · Image retrieval

1 The College of Mathematics and Computer Science, Fuzhou University, Fujian, China

1 Introduction With the development of the Internet, the number of images is growing exponentially. To offer users images that meet their needs, it is essential to tag images correctly. Because many online images currently have no labels, retrieving images by text is difficult. The time and labor costs of manually labeling images keep rising with the explosive growth in the number of images. Moreover, the results of manual labeling vary greatly from one annotator to another. For these reasons, automatic image annotation has great potential, and improving the accuracy of image annotation is of immense commercial interest. Current automatic image annotation aims to assign a few relevant words from a limited vocabulary to unlabeled images. Most automatic image annotation algorithms proceed in three steps: first, extract features from the training and testing images; second, build the annotation model from the training data; finally, generate annotations from the features of the testing images. The detailed process is shown in Fig. 1. From this process, two factors essentially affect the result: one is the