Adaptive Decision Threshold-Based Extreme Learning Machine for Classifying Imbalanced Multi-label Data


Shang Gao1,2 · Wenlu Dong1 · Ke Cheng1 · Xibei Yang1 · Shang Zheng1,2 · Hualong Yu1,2

Accepted: 31 August 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Multi-label learning is a popular area of machine learning research, as it is widely applicable to many real-world scenarios. Compared with traditional binary and multi-class classification tasks, multi-label data are more easily affected by an imbalanced data distribution. This paper describes an adaptive decision threshold-based extreme learning machine algorithm (ADT-ELM) that addresses the imbalanced multi-label data classification problem. Specifically, the macro and micro F-measure metrics are adopted as the optimization functions for ADT-ELM, and the particle swarm optimization algorithm is employed to determine the optimal decision threshold combination. The optimized thresholds are then used to make decisions for future multi-label instances. Twelve benchmark multi-label data sets are used in a series of experiments to verify the effectiveness and superiority of the proposed algorithm. The experimental results indicate that the proposed ADT-ELM algorithm is significantly superior to many state-of-the-art multi-label imbalance learning algorithms, and it generally requires less training time than more sophisticated algorithms.

Keywords Multi-label classification · Class imbalance learning · Stochastic optimization · Decision threshold moving · Extreme learning machine
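As a rough illustration of the threshold-moving idea summarised above, and not the authors' implementation, the sketch below tunes one decision threshold per label with a simple particle swarm optimizer so as to maximise macro F-measure on held-out label scores. The extreme learning machine is stubbed out with noisy synthetic scores, and the helper names (macro_f1, pso_thresholds) and all PSO parameters are illustrative assumptions.

```python
# Hedged sketch: per-label decision-threshold tuning via particle swarm
# optimization (PSO).  The "ELM" is replaced by noisy synthetic scores.
import numpy as np

rng = np.random.default_rng(0)

def macro_f1(y_true, scores, thresholds):
    """Macro-averaged F1 of thresholded scores (one threshold per label)."""
    y_pred = (scores >= thresholds).astype(int)
    f1s = []
    for j in range(y_true.shape[1]):
        tp = np.sum((y_pred[:, j] == 1) & (y_true[:, j] == 1))
        fp = np.sum((y_pred[:, j] == 1) & (y_true[:, j] == 0))
        fn = np.sum((y_pred[:, j] == 0) & (y_true[:, j] == 1))
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(f1s))

def pso_thresholds(y_true, scores, n_particles=20, n_iter=50,
                   w=0.7, c1=1.5, c2=1.5):
    """Search one decision threshold per label maximising macro F1."""
    n_labels = y_true.shape[1]
    lo, hi = scores.min(), scores.max()
    pos = rng.uniform(lo, hi, size=(n_particles, n_labels))   # particle positions
    vel = np.zeros_like(pos)
    pbest, pbest_fit = pos.copy(), np.array([macro_f1(y_true, scores, p) for p in pos])
    gbest, gbest_fit = pbest[pbest_fit.argmax()].copy(), pbest_fit.max()
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        fit = np.array([macro_f1(y_true, scores, p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        if fit.max() > gbest_fit:
            gbest, gbest_fit = pos[fit.argmax()].copy(), fit.max()
    return gbest, gbest_fit

# Toy demo: imbalanced labels for 200 validation instances and 5 labels.
y_true = (rng.random((200, 5)) < 0.2).astype(int)          # sparse positives
scores = y_true + rng.normal(0, 0.6, size=y_true.shape)     # noisy "ELM" outputs
thr, fit = pso_thresholds(y_true, scores)
print("optimised thresholds:", np.round(thr, 2), "macro-F1:", round(fit, 3))
```

In ADT-ELM itself, the scores would come from a trained extreme learning machine, and both the macro and micro F-measure are used as optimization objectives.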

1 Introduction

In supervised learning, the most widely studied case is single-label learning, in which only one class label is assigned to each instance. In recent years, however, a new supervised learning paradigm named multi-label learning has attracted the attention of many researchers in machine learning [1]. In contrast to traditional single-label learning, multi-label learning simultaneously assigns multiple different labels to the same instance.
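As a minimal illustration (not taken from the paper), multi-label targets are commonly encoded as a binary indicator matrix with one column per label, so a single row may contain several 1s; the label names below are the illustrative ones used later in Fig. 1.

```python
import numpy as np

# Illustrative only: binary label-indicator matrix for 4 instances and
# 5 candidate labels; each row may carry several 1s (multi-label), unlike
# single-label learning where exactly one entry per row would be 1.
labels = ["beach", "ship", "human", "cloud", "sky"]
Y = np.array([
    [1, 0, 1, 1, 1],   # instance 1: beach, human, cloud, sky
    [0, 1, 0, 0, 1],   # instance 2: ship, sky
    [1, 1, 0, 1, 1],   # instance 3: beach, ship, cloud, sky
    [0, 0, 1, 0, 0],   # instance 4: human only
])
print(dict(zip(labels, Y.sum(axis=0))))  # per-label positive counts
```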

Hualong Yu (corresponding author)
[email protected]

1 School of Computer, Jiangsu University of Science and Technology, No. 2, Mengxi Road, Zhenjiang 212003, Jiangsu, People’s Republic of China

2 Artificial Intelligence Key Laboratory of Sichuan Province, Sichuan University of Science and Engineering, Yibin 644000, People’s Republic of China

Fig. 1 An example of a multi-label image

In fact, multi-label learning has a wide range of real-world applications: for example, an image can simultaneously contain a sandy beach, a ship, people, clouds, and the sky, while a text document may cover politics, economics, and the military [2–7]. An example of a multi-label image is presented in Fig. 1. Because a multi-label data set may contain numerous labels while each instance is generally associated with only a few of them, multi-label data often suffer from a severe imbalance in the number of instances assigned to each class. Indeed, class imbalance learning is a hot topic