Multi-label classification by formulating label-specific features from simultaneous instance level and feature level



Yuanyuan Guan1 · Wenhui Li1 · Boxiang Zhang1 · Bing Han2 · Manglai Ji1

Accepted: 7 October 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Multi-label learning (MLL) trains a classification model from multi-labelled data, where each training instance is annotated with a set of class labels simultaneously. Following the binary relevance paradigm of MLL, a recently effective idea is to construct specific features for each label, instead of training over the original feature space. Existing label-specific methods, however, only consider information from instance distributions, which leaves the reconstructed features poorly discriminative. In this paper, we propose generating Label-spEcific feaTures by simultaneously exploring insTance distributions and fEatuRe distributions, and suggest a new method named LETTER. LETTER reconstructs two subsets of new features, from the instance level and the feature level, respectively. More concretely, at the instance level LETTER incorporates a sparse constraint, and at the feature level it clusters the original features to construct new features as an extension. The combination of these two new feature subsets forms the final set of label-specific features. Extensive experiments on 14 benchmark datasets verify the competitive performance of LETTER against existing state-of-the-art MLL methods.

Keywords: Multi-label classification · Binary relevance · Label-specific feature · Feature distribution
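The feature-level step described in the abstract, clustering the original features and deriving one new feature per cluster, can be sketched as follows. This is only an illustrative sketch: using plain k-means over feature columns and representing each cluster by the mean of its member features are our assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 12))        # 100 instances, 12 original features

def cluster_features(X, m, iters=50, seed=0):
    """Group the d feature columns into m clusters with plain k-means
    (run on the transposed matrix, so each "point" is a feature), then
    build one new feature per cluster as the mean of its member columns."""
    F = X.T                                        # d points of dimension n
    rng = np.random.default_rng(seed)
    centers = F[rng.choice(len(F), m, replace=False)]
    for _ in range(iters):
        dist = ((F[:, None, :] - centers[None]) ** 2).sum(-1)
        assign = dist.argmin(1)                    # nearest center per feature
        for j in range(m):
            if (assign == j).any():
                centers[j] = F[assign == j].mean(0)
    # New feature j = average of the original features assigned to cluster j.
    Z = np.column_stack([X[:, assign == j].mean(1) if (assign == j).any()
                         else np.zeros(len(X)) for j in range(m)])
    return Z, assign

Z, assign = cluster_features(X, m=4)
print(Z.shape)   # (100, 4): four cluster-level features per instance
```

Averaging within a cluster is one simple choice of aggregation; any cluster-level summary (e.g. the cluster center itself) would fit the same scheme.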

Corresponding author: Wenhui Li

1 College of Computer Science and Technology, Jilin University, Jilin, China
2 Northeast Normal University, Jilin, China

1 Introduction

Multi-label learning (MLL) [39] trains a classification model from multi-labelled data, where each instance is annotated with a set of class labels simultaneously. It has received extensive attention in many applications, such as image annotation, where each image may contain several semantic objects [4, 32]; document classification, where each document may belong to multiple topics [20, 27]; and music emotion classification, where each piece of music can cover various emotions [25, 28]. Formally, let D = {X, Y} be the training dataset, where X = R^d denotes the d-dimensional feature space and Y = {0, 1}^q denotes the q-dimensional label space with labels {l_k | 1 ≤ k ≤ q}. Each instance x_i ∈ X is annotated with a subset of labels: the value of l_k equals 1 or 0, indicating whether the k-th label is relevant to the instance. The task of MLL is to learn a classifier h: X → Y that predicts the label set of an unseen instance. Over the past decades, a large number of algorithms have been proposed for the MLL task. Problem transformation is a straightforward yet effective methodology that transforms
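The formal setup above (instances x ∈ R^d, binary label vectors y ∈ {0, 1}^q, classifier h: X → Y) can be illustrated with a minimal binary-relevance sketch: one independent binary model per label. The synthetic data and the gradient-descent logistic base learner below are our assumptions for illustration, not part of the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic multi-label dataset: n instances in a d-dimensional feature
# space, each annotated with a q-dimensional 0/1 label vector.
n, d, q = 200, 10, 3
X = rng.normal(size=(n, d))
W_true = rng.normal(size=(d, q))
Y = (X @ W_true > 0).astype(int)      # column k holds label l_k for all instances

def fit_binary(X, y, lr=0.1, epochs=200):
    """Plain logistic regression by gradient descent, for one label."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

# Binary relevance: train q independent binary classifiers, one per label.
W_hat = np.column_stack([fit_binary(X, Y[:, k]) for k in range(q)])

def h(x):
    """Classifier h: X -> {0,1}^q for a single instance x."""
    return (x @ W_hat > 0).astype(int)

print(h(X[0]), Y[0])
```

Label-specific methods such as LETTER keep this one-model-per-label decomposition but replace the shared input X with a feature set tailored to each label.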