Stacking model of multi-label classification based on pruning strategies
SMART DATA AGGREGATION INSPIRED PARADIGM & APPROACHES IN IOT APPLNS
Haiyang Liu1 • Zhihai Wang1 • Yange Sun1
Received: 17 September 2018 / Accepted: 12 November 2018
Springer-Verlag London Ltd., part of Springer Nature 2018
Abstract
Exploiting dependencies between labels is the key to improving the performance of multi-label classification. In this paper, we divide the methods that exploit label dependence into two groups according to how they transform the problem: label grouping methods and feature space extending methods. For feature space extending methods, we find that the common problem is how to measure the dependencies between labels and how to select the proper labels to add to the original feature space. Therefore, we propose a ReliefF-based pruning model for multi-label classification (ReliefF-based stacking, RFS). RFS measures the dependencies between labels from a feature selection perspective and then adds the more relevant labels to the original feature space. Experimental results on 9 multi-label benchmark datasets show that RFS is more effective than other advanced multi-label classification algorithms.

Keywords Multi-label classification · Label dependence · Feature selection · ReliefF
✉ Haiyang Liu: [email protected]
1 School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China

1 Introduction
Multi-label classification (MLC) is a machine learning problem in which models are sought that assign a subset of labels to each instance, unlike conventional (single-class) classification, which involves predicting only a single class [1]. The multi-label problem is receiving increased attention and is relevant to many domains, such as text categorization [2], classification of music [3] and videos [4], and semantic annotation of images [5].

Recently, many studies have sought efficient and accurate algorithms to cope with the multi-label classification challenge. These algorithms are usually partitioned into two main categories: algorithm adaptation and problem transformation. Algorithm adaptation methods extend specific learning algorithms so that they handle multi-label data directly. Problem transformation methods, on the other hand, are algorithm independent: they transform the learning task into one or more single-label classification tasks. In this paper, we mainly focus on problem transformation methods.

In multi-label classification problems, labels do not occur independently of each other; instead, there are statistical dependencies between them. For example, the probability of a movie being annotated with the label action would be high if we know it has the labels crime and drama, whereas a piece of news is unlikely to be labelled military if it is related to sports. From a learning and prediction point of view, effective exploitation of label dependency information is crucial for the success of multi-label learning techniq
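The two ideas above, problem transformation and label dependence, can be made concrete with a small sketch. It is only an illustration under stated assumptions: the toy data, the label names and the helper functions `binary_relevance`, `marginal_frequency` and `conditional_frequency` are hypothetical and are not taken from the paper. The transformation shown is the simplest one (one binary task per label), not the RFS model, and the dependence is estimated by plain co-occurrence counting rather than ReliefF.

```python
from typing import Dict, List, Optional, Set, Tuple

# Toy multi-label data: (feature vector, set of labels). All values are invented.
Instance = Tuple[List[float], Set[str]]

DATA: List[Instance] = [
    ([1.0, 0.0], {"action", "crime", "drama"}),
    ([0.9, 0.1], {"action", "crime", "drama"}),
    ([0.8, 0.3], {"crime", "drama"}),
    ([0.1, 0.9], {"comedy"}),
    ([0.2, 0.8], {"comedy", "drama"}),
    ([0.7, 0.2], {"action"}),
]


def binary_relevance(
    data: List[Instance], labels: List[str]
) -> Dict[str, List[Tuple[List[float], int]]]:
    """Problem transformation: build one single-label (binary) dataset per label."""
    return {
        label: [(x, int(label in y)) for x, y in data]
        for label in labels
    }


def marginal_frequency(data: List[Instance], target: str) -> float:
    """Fraction of instances that carry the label `target`."""
    return sum(target in y for _, y in data) / len(data)


def conditional_frequency(
    data: List[Instance], target: str, given: Set[str]
) -> Optional[float]:
    """Fraction of instances carrying `target` among those that carry every label in `given`."""
    matching = [y for _, y in data if given <= y]
    if not matching:
        return None  # the condition never occurs in the data
    return sum(target in y for y in matching) / len(matching)


LABELS = sorted(set().union(*(y for _, y in DATA)))
tasks = binary_relevance(DATA, LABELS)

p_action = marginal_frequency(DATA, "action")
p_action_given = conditional_frequency(DATA, "action", {"crime", "drama"})

print("binary targets for 'action':", [t for _, t in tasks["action"]])
print("P(action) =", p_action)
print("P(action | crime, drama) =", p_action_given)
```

In this toy data the conditional frequency of action given crime and drama is higher than its marginal frequency, which is the kind of dependence that independent per-label tasks ignore.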