Improving Multi-label Learning with Missing Labels by Structured Semantic Correlations
Multi-label learning has attracted significant interests in computer vision recently, finding applications in many vision tasks such as multiple object recognition and automatic image annotation. Associating multiple labels to a complex image is very diff
- PDF / 1,947,994 Bytes
- 17 Pages / 439.37 x 666.142 pts Page_size
- 108 Downloads / 187 Views
3
Rolls-Royce@NTU Corp Lab, Singapore, Singapore [email protected] 2 IHPC, A*STAR, Singapore, Singapore [email protected] School of Computer Science and Engineering, NTU, Singapore, Singapore [email protected]
Abstract. Multi-label learning has attracted significant interests in computer vision recently, finding applications in many vision tasks such as multiple object recognition and automatic image annotation. Associating multiple labels to a complex image is very difficult, not only due to the intricacy of describing the image, but also because of the incompleteness nature of the observed labels. Existing works on the problem either ignore the label-label and instance-instance correlations or just assume these correlations are linear and unstructured. Considering that semantic correlations between images are actually structured, in this paper we propose to incorporate structured semantic correlations to solve the missing label problem of multi-label learning. Specifically, we project images to the semantic space with an effective semantic descriptor. A semantic graph is then constructed on these images to capture the structured correlations between them. We utilize the semantic graph Laplacian as a smooth term in the multi-label learning formulation to incorporate the structured semantic correlations. Experimental results demonstrate the effectiveness of the proposed semantic descriptor and the usefulness of incorporating the structured semantic correlations. We achieve better results than state-of-the-art multi-label learning methods on four benchmark datasets.
1
Introduction
Multi-label learning has been an important research topic in machine learning [1–3] and data mining [4,5]. Unlike conventional classification problems, in multi-label learning each instance can be associated with multiple labels simultaneously. During recent years, multi-label learning has been applied on many computer vision tasks, especially on visual object recognition [6–8] and automatic image annotation [9–11]. In addition to the difficulty of assigning multiple labels/tags to complex images, multi-label learning often encounters the problem of incomplete labels. In real world scenarios, since the number of possible labels/tags is often very large (could be as large as the whole vocabulary set) and there often exist ambiguities among labels (e.g., “car” vs “SUV”), it is very c Springer International Publishing AG 2016 B. Leibe et al. (Eds.): ECCV 2016, Part I, LNCS 9905, pp. 835–851, 2016. DOI: 10.1007/978-3-319-46448-0 50
836
H. Yang et al.
difficult to obtain a perfectly labeled training set. Figure 1 shows some examples of annotations from Flickr25K dataset. We can see that many possible labels are missing as it is impossible for labelers to go through the entire vocabulary set to extract all proper tags.
Fig. 1. Example labels from Flickr25K dataset. The bold face labels are original annotations from the users. The italic labels are other possible labels. These examples illustrate the missing labels problem of multi-l
Data Loading...