

Feature selection with multi-objective genetic algorithm based on a hybrid filter and the symmetrical complementary coefficient

Rui Zhang1 · Zuoquan Zhang1 · Di Wang1 · Marui Du1

Accepted: 16 October 2020 © Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

With the expansion of data size and data dimension, feature selection attracts more and more attention. In this paper, we propose a novel feature selection algorithm, namely, Hybrid filter and Symmetrical Complementary Coefficient based Multi-Objective Genetic Algorithm feature selection (HSMOGA). HSMOGA combines a new hybrid filter, the Symmetrical Complementary Coefficient (a recently proposed, well-performing metric of feature interactions), and a novel way to limit the feature subset's size. A new Pareto-based ranking function is proposed for solving the multi-objective problem. Besides, HSMOGA starts with a novel step called knowledge reserve, which precalculates the knowledge required for fitness-function calculation and initial-population generation. In this way, HSMOGA is classifier-independent in each generation, and its initial-population generation makes full use of the knowledge of the data set, which makes solutions converge faster. Compared with other GA-based feature selection methods, HSMOGA has a much lower time complexity. According to the experimental results, HSMOGA outperforms nine other state-of-the-art feature selection algorithms, including five classic and four more recent algorithms, in terms of kappa coefficient, accuracy, and G-mean for the data sets tested.

Keywords Feature selection · Feature interaction · Hybrid filter · Symmetrical complementary coefficient · Multi-objective genetic algorithm
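The abstract refers to a Pareto-based ranking function for the multi-objective problem. As a generic illustration only (not HSMOGA's actual ranking function, whose details appear later in the paper), the first Pareto front over two objectives — e.g., classification accuracy to maximize and feature-subset size to minimize, here encoded as a negated size — can be computed as follows; the score values are made up for the example:

```python
from typing import List, Tuple

def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True if solution a Pareto-dominates b (both objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(scores: List[Tuple[float, float]]) -> List[int]:
    """Indices of non-dominated solutions (the first Pareto front)."""
    return [i for i, a in enumerate(scores)
            if not any(dominates(b, a) for j, b in enumerate(scores) if j != i)]

# Hypothetical candidate subsets: (accuracy, -subset size)
scores = [(0.90, -10), (0.85, -5), (0.92, -12), (0.80, -4), (0.90, -12)]
print(pareto_front(scores))  # → [0, 1, 2, 3]
```

The last candidate (0.90, −12) is excluded because (0.90, −10) achieves the same accuracy with fewer features; Pareto-based GA selection ranks individuals by which front they fall on rather than by a single weighted score.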

This work is supported by the National Natural Science Foundation of China under Grant 51727813.

Zuoquan Zhang [email protected] · Rui Zhang [email protected] · Di Wang [email protected] · Marui Du [email protected]

1 School of Science, Beijing Jiaotong University, Beijing, China

1 Introduction

Feature selection plays an important role in many aspects of machine learning, such as multivariate classification (including binary classification), where each instance has just one class, and multi-label classification [18, 19], where there is more than one class variable or each instance can belong to multiple classes at the same time, and

sometimes there are dependencies between these classes. These classification tasks all need to learn from the input data, and the features characterize the data from different perspectives. However, for a given data set, many features are sometimes not helpful for learning and mining tasks, or are even harmful [20]. Thus, selecting the correct features is essential. Nowadays, the demand for feature selection keeps growing, as data sets are getting bigger and wider.

1.1 Literature review

Features that need to be handled can be divided into three types, i.e., irrelevant features, redundant features, and interactive features. An irrelevant feature is one that does not help with learning an