Beyond missing: weakly-supervised multi-label learning with incomplete and noisy labels
Lijuan Sun1 · Gengyu Lyu1 · Songhe Feng1 · Xiankai Huang2
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract

Weakly-supervised multi-label learning has received much attention recently, and most existing methods address the problem with either missing or noisy labels, while the setting with both missing and noisy labels has not been well investigated. In this paper, we propose a novel COst-sensitive label Ranking Approach with Low-rank and Sparse constraints (CORALS) to enrich the missing labels and remove the noisy labels simultaneously. Unlike most existing studies, which require an indicator matrix to be given in advance (something that may not be available in reality), we construct a label confidence matrix to reflect the relevance between labels and their corresponding instances, and then optimize the relevance ordering of all possible labels, including both missing and noisy labels, on each instance by minimizing a cost-sensitive ranking loss. By considering the dependencies in both the feature space and the label space, we exploit dual low-rank regularization terms to capture the corresponding correlations. Furthermore, since both missing and noisy labels are rare, a sparse regularization term is employed to constrain such noisy information to be sparse. Comprehensive experimental results demonstrate the effectiveness of the proposed method.

Keywords Multi-label learning · Incomplete and noisy labels · Cost-sensitive · Low-rank and sparse · Label correlations
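To make the formulation concrete, the following is a plausible schematic of the kind of objective the abstract describes; the notation (label confidence matrix C, sparse residual E, observed label matrix Y, linear predictor f(x) = W^T x, pair costs c_pq, trade-off parameters lambda_1..3) is illustrative and may differ from the paper's exact formulation:

\min_{W,\,C,\,E}\ \sum_{i=1}^{n} \sum_{(p,q)} c_{pq}\,\big[\, f_q(\mathbf{x}_i) - f_p(\mathbf{x}_i) \,\big]_{+}
\;+\; \lambda_1 \lVert W \rVert_{*}
\;+\; \lambda_2 \lVert C \rVert_{*}
\;+\; \lambda_3 \lVert E \rVert_{1}
\qquad \text{s.t.}\ \ Y = C + E

Here the inner sum ranges over label pairs (p, q) that C deems relevant versus irrelevant for instance x_i, and the hinge [.]_+ = max(0, .) penalizes scoring an irrelevant label above a relevant one, weighted by the pair-specific cost c_pq. The two nuclear norms encode the dual low-rank assumption in the feature space (on W) and the label space (on C), while the l1 penalty on E reflects the assumption that missing and noisy entries in Y are sparse.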
1 Introduction

Multi-Label Learning (MLL), which aims to learn a model from data in which one instance is associated with multiple class labels [12, 18, 28, 32], has been widely applied in various fields such as image annotation and text categorization.
Lijuan Sun and Gengyu Lyu contributed equally to this work.

Songhe Feng [email protected]
Lijuan Sun [email protected]
Gengyu Lyu [email protected]
Xiankai Huang [email protected]

1 School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
2 Beijing Technology and Business University, Beijing, China
In traditional multi-label learning algorithms, a common assumption is that each training sample has been precisely annotated with all of its relevant labels. BR (Binary Relevance) [1] and RankSVM [8] are two classical multi-label learning methods that directly train multi-label classifiers on precisely labeled examples (a minimal sketch of the BR decomposition is given after this paragraph). Different from the methods discussed above, LIFT [26] exploits label-specific features to induce the multi-label predictor. Recently, weakly-supervised multi-label learning has emerged as a hot topic [4, 7, 13, 19, 25] due to the existence of incomplete and/or noisy labels in the multi-label learning framework. The former implies that the training examples may be assigned only a subset of their relevant labels while the others are missing; the latter assumes that the labels assigned to the training examples may contain noise, i.e., some assigned labels are actually irrelevant.
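For context, Binary Relevance reduces a multi-label task to one independent binary classification problem per label. A minimal Python sketch, assuming scikit-learn is available and the label matrix is dense and binary (all variable names are illustrative):

import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_binary_relevance(X, Y):
    # Train one independent binary classifier per label column.
    # X: (n_samples, n_features); Y: (n_samples, n_labels) with entries in {0, 1}.
    return [LogisticRegression(max_iter=1000).fit(X, Y[:, j]) for j in range(Y.shape[1])]

def predict_binary_relevance(classifiers, X):
    # Stack the per-label predictions back into a label matrix.
    return np.column_stack([clf.predict(X) for clf in classifiers])

# Illustrative usage with synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
Y = (rng.random((100, 3)) > 0.5).astype(int)
Y_hat = predict_binary_relevance(fit_binary_relevance(X, Y), X)

Because each label is modeled in isolation, BR ignores label correlations entirely; this is precisely the limitation that motivates exploiting label dependencies (e.g., via low-rank structure) in the weakly-supervised setting studied here.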