The image annotation algorithm using convolutional features from intermediate layer of deep learning



Yuantao Chen 1 & Linwu Liu 1 & Jiajun Tao 1 & Xi Chen 1 & Runlong Xia 2 & Qian Zhang 3 & Jie Xiong 4 & Kai Yang 3 & Jingbo Xie 2

Received: 1 April 2020 / Revised: 7 September 2020 / Accepted: 16 September 2020

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Automatic image annotation predicts the annotation of an unknown image by automatically learning potential relationships between the semantic concept space and the visual feature space of an annotated image dataset. The process usually comprises two stages: learning and labeling. Existing image annotation methods that employ convolutional features from deep learning have a number of limitations, including complex training and the high space and time costs of the annotation procedure. Accordingly, this paper proposes a method in which the visual features of an image are represented by the intermediate-layer features of a deep learning model, while semantic concepts are represented by the mean vectors of positive samples. Firstly, the convolutional result of an intermediate layer of the pretrained deep learning model is output directly as low-level visual features, and the image is represented by sparse coding. Secondly, the positive mean vector method is used to construct a visual feature vector for each text vocabulary item, so that a visual feature vector database is created. Finally, the visual feature vector similarity between the testing image and every text vocabulary item is calculated, and the vocabulary items with the largest similarity are used for annotation. Experiments on the Corel5k and IAPR TC-12 datasets demonstrate the effectiveness of the proposed method: in terms of F1 score, it outperforms MBRM, JEC-AF, JEC-DF, and 2PKNN with end-to-end deep features on both datasets.

Keywords: Deep learning · Image annotation · Convolutional results · Positive mean vector · Eigenvector
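The second and third steps of the abstract (positive mean vectors, then similarity-based labeling) can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors are hand-written toy values standing in for sparse-coded intermediate-layer features, cosine similarity is an assumed choice because the abstract does not name the similarity measure, and the function names `build_word_vectors` and `annotate` are hypothetical.

```python
import math
from collections import defaultdict


def mean_vector(vectors):
    """Component-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity (an assumed measure; the abstract does not specify one)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_word_vectors(train_features, train_labels):
    """Positive mean vector: average the features of all images tagged with a word."""
    positives = defaultdict(list)
    for feature, words in zip(train_features, train_labels):
        for word in words:
            positives[word].append(feature)
    return {word: mean_vector(feats) for word, feats in positives.items()}


def annotate(feature, word_vectors, top_k=1):
    """Return the top_k vocabulary items most similar to the test image feature."""
    ranked = sorted(
        word_vectors,
        key=lambda word: cosine(feature, word_vectors[word]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy data: hypothetical 3-dimensional features, not real CNN outputs.
train_features = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.8, 0.2]]
train_labels = [["sky"], ["sky", "sea"], ["grass"], ["grass"]]

word_vectors = build_word_vectors(train_features, train_labels)
print(annotate([0.95, 0.05, 0.0], word_vectors, top_k=2))
```

In this sketch the vocabulary database is built with a single pass over the training set, and labeling a test image needs only one similarity computation per vocabulary item.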

* Yuantao Chen
Extended author information available on the last page of the article

Multimedia Tools and Applications

1 Introduction

In only two decades, automatic image annotation has become a research hotspot in the fields of image processing [3], image classification [4], image segmentation [5], computer vision and pattern recognition [19], among others [11]. The success of an image annotation task depends mainly on the annotation model and the visual feature vectors used, with the quality of the visual feature vectors determining the upper limit of annotation quality. In recent years, as image annotation models have matured, visual feature vectors have increasingly become the decisive factor in annotation performance. Image annotation technology uses keywords that describe the semantic content of the image, thereby narrowing the gap between the underl