Label Embedding for Multi-label Classification Via Dependence Maximization
Yachong Li1 · Youlong Yang1
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Correspondence: Youlong Yang, [email protected]; Yachong Li, [email protected]
1 School of Mathematics and Statistics, Xidian University, Xi'an 710071, People's Republic of China
Abstract Multi-label classification has attracted extensive attention in various fields. With the emergence of high-dimensional label spaces, much recent research has been devoted to label embedding. However, current embedding approaches either do not take feature-space correlation sufficiently into consideration or require an explicit encoding function while learning the embedded space. Moreover, few of them can be extended to handle missing labels. In this paper, we propose a Label Embedding method via Dependence Maximization (LEDM), which learns a latent space in which label and feature information are embedded simultaneously. To this end, a low-rank factorization model on the label matrix is applied to exploit label correlations instead of an encoding process. The dependence between the feature space and the label space is increased via the Hilbert–Schmidt independence criterion to improve predictability. The proposed LEDM can easily be extended to handle missing labels while learning the embedded space. Comprehensive experimental results on benchmark data sets validate the effectiveness of our approach over state-of-the-art methods in both the complete-label and missing-label cases.

Keywords Multi-label learning · Label embedding · Low-rank factorization · Hilbert–Schmidt independence criterion · Missing labels
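For readers unfamiliar with the two building blocks named in the abstract, their standard forms can be written compactly. The notation below is background of our own choosing (the empirical HSIC estimator of Gretton et al. and a generic low-rank label factorization), not the paper's exact formulation:

```latex
% Empirical HSIC for n paired samples with kernel matrices
% K_{ij} = k(x_i, x_j) on features and L_{ij} = l(y_i, y_j) on labels;
% H centers both kernel matrices. Larger values indicate stronger dependence.
\mathrm{HSIC}(X, Y) = \frac{1}{(n-1)^2} \operatorname{tr}(K H L H),
\qquad H = I_n - \tfrac{1}{n} \mathbf{1}\mathbf{1}^{\top}

% Generic low-rank factorization of the label matrix
% Y \in \{0, 1\}^{n \times q}: the factor U plays the role of the latent
% (embedded) label space, with embedding dimension k \ll q.
Y \approx U V, \qquad U \in \mathbb{R}^{n \times k}, \quad V \in \mathbb{R}^{k \times q}
```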
1 Introduction

In machine learning, multi-label classification refers to the situation where an instance is associated with a set of labels simultaneously. It has widespread applications, including text classification [1], categorization of genes [2], image annotation [3], and so on. Multi-label classification has therefore attracted increasing attention in the research community. Currently, there are two principal approaches to multi-label learning. One is called problem transformation, which switches a multi-label classification task into multiple
single-label classification tasks, such as Binary Relevance (BR) [4,5], Label Power-set [6] and Classifier Chain [7]. The other is algorithm adaptation, which extends existing classification techniques directly, for instance Multi-label K-Nearest Neighbor [8] and AdaBoost.MH [9]. Nevertheless, with the exponential increase in the number of labels, it is computationally impractical for many conventional multi-label classification algorithms to work in the original label space. Under such conditions, a great number of label embedding methods have been designed to alleviate this problem; they not only improve classification performance but also reduce the cost of training and prediction. Label embedding is a popular paradigm that views each possible label set as a high-dimensional label vector, which
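To make the problem-transformation idea concrete, below is a minimal sketch of Binary Relevance, assuming scikit-learn-style estimators; the class name and the choice of LogisticRegression as the base learner are illustrative assumptions, not part of this paper's method:

```python
# Minimal Binary Relevance sketch: one independent binary classifier per
# label column of Y. Illustrative only; not the paper's LEDM method.
import numpy as np
from sklearn.linear_model import LogisticRegression

class BinaryRelevance:
    """Train one binary classifier per label column of Y (n_samples x n_labels)."""

    def __init__(self):
        self.models = []

    def fit(self, X, Y):
        self.models = []
        for j in range(Y.shape[1]):
            clf = LogisticRegression(max_iter=1000)
            clf.fit(X, Y[:, j])  # the j-th label is treated as a separate task
            self.models.append(clf)
        return self

    def predict(self, X):
        # Stack the per-label binary predictions back into a label matrix.
        return np.column_stack([m.predict(X) for m in self.models])

# Usage: X of shape (n, d), Y a binary matrix of shape (n, q).
# br = BinaryRelevance().fit(X, Y); Y_hat = br.predict(X_new)
```

Note that this transformation treats every label independently and therefore ignores label correlations, and its cost grows linearly with the number of labels; both limitations motivate the embedding-based methods discussed here.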