Multi-label text classification with latent word-wise label information
Ziheng Chen¹ · Jiangtao Ren¹
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract Multi-label text classification (MLTC) is a significant task that aims to assign multiple labels to each given text. There are usually correlations between the labels in the dataset. However, traditional machine learning methods tend to ignore the label correlations. To capture the dependencies between the labels, the sequence-to-sequence (Seq2Seq) model has been applied to MLTC tasks. Moreover, to reduce the incorrect penalty that the Seq2Seq model incurs when the generated labels appear in an inconsistent order, a deep reinforced sequence-to-set (Seq2Set) model has been proposed. However, the label generation of the Seq2Set model still relies on a sequence decoder, which cannot eliminate the influence of the predefined label order and exposure bias. Therefore, we propose an MLTC model with latent word-wise label information (MLC-LWL), which constructs effective word-wise label information using a labeled topic model and combines the label information carried by each word with the label context information through a gated network. With the word-wise label information, our model captures the correlations between the labels via a label-to-label structure without being affected by the predefined label order or exposure bias. Extensive experimental results illustrate the effectiveness and significant advantages of our model compared with the state-of-the-art methods.
Keywords Multi-label text classification · Labeled topic model · Word-wise label information · Label-to-label structure
Jiangtao Ren · Ziheng Chen
¹ School of Data and Computer Science, Sun Yat-Sen University, No. 132, Waihuandong Road, Guangzhou Higher Education Megacenter, 510006, Guangzhou, Guangdong, People’s Republic of China
1 Introduction Multi-label text classification (MLTC) is a significant task in natural language processing (NLP) that aims to assign multiple labels to each given text. It is increasingly required in various modern applications, such as document categorization [21], tag suggestion [13], and context recommendation [38]. An early machine learning method, binary relevance (BR) [4], treats the MLTC problem as multiple independent binary classifications and achieves satisfactory performance. With the development of deep learning, neural networks are increasingly used to tackle MLTC tasks [2, 22, 36]. However, these models ignore the correlations between the labels. To capture the dependencies between the labels, methods such as ML-DT [7], rank-SVM [8], LP [30], ML-KNN [37], and CC [25] have been proposed. However, they tend to be computationally intractable when the number of labels increases since they construct a model for each label individually. In addition, Kurata et al. propose to capture the label correlations using an initialization method based on the convolutional neural network (CNN) [15], while Chen et al. m
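As a point of reference for the binary relevance (BR) baseline mentioned in the introduction, the following is a minimal sketch of the idea: one independent binary classifier per label, whose positive decisions are collected into the predicted label set. The whitespace tokenizer, the perceptron base classifier, the class names `BinaryPerceptron` and `BinaryRelevance`, and the toy texts are all illustrative assumptions, not taken from the paper or from [4]. Because each label is fitted in isolation, the sketch also shows why BR cannot model correlations between labels.

```python
def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately simple assumption)."""
    return text.lower().split()


class BinaryPerceptron:
    """Tiny bag-of-words perceptron, used here as a stand-in base classifier."""

    def __init__(self):
        self.weights = {}
        self.bias = 0.0

    def score(self, tokens):
        return self.bias + sum(self.weights.get(t, 0.0) for t in tokens)

    def predict(self, tokens):
        return self.score(tokens) > 0

    def fit(self, docs, targets, epochs=10):
        for _ in range(epochs):
            for tokens, y in zip(docs, targets):
                if int(self.predict(tokens)) != y:
                    # Standard perceptron update on a misclassified example.
                    step = 1.0 if y == 1 else -1.0
                    for t in tokens:
                        self.weights[t] = self.weights.get(t, 0.0) + step
                    self.bias += step
        return self


class BinaryRelevance:
    """Trains one independent binary classifier per label."""

    def fit(self, texts, label_sets):
        docs = [tokenize(t) for t in texts]
        self.labels = sorted(set().union(*label_sets))
        self.models = {
            # Each label gets its own 0/1 target vector; no information
            # about the other labels is shared between the classifiers.
            label: BinaryPerceptron().fit(docs, [int(label in s) for s in label_sets])
            for label in self.labels
        }
        return self

    def predict(self, text):
        tokens = tokenize(text)
        return {label for label, m in self.models.items() if m.predict(tokens)}


# Hypothetical toy corpus, for illustration only.
TEXTS = [
    "stocks rally on bank earnings",
    "team wins football final",
    "football club shares rally",
    "rain expected tomorrow",
]
LABEL_SETS = [{"finance"}, {"sports"}, {"finance", "sports"}, set()]

model = BinaryRelevance().fit(TEXTS, LABEL_SETS)
print(model.predict("bank shares rally"))
print(model.predict("football club shares rally"))
```

The number of base classifiers grows linearly with the number of labels, and each prediction is made without looking at the others, which is the limitation that correlation-aware methods aim to address.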