A new feature selection using dynamic interaction



THEORETICAL ADVANCES

Zhang Li 1,2
Received: 13 May 2019 / Accepted: 9 September 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract
With the continuous development of Internet technology, data are becoming increasingly complicated and high-dimensional. Such high-dimensional data contain large numbers of redundant and irrelevant features, which pose great challenges to existing machine learning algorithms. Feature selection is one of the important research topics in machine learning, pattern recognition and data mining, and it is also an important step in the data preprocessing stage. Feature selection seeks an optimal feature subset of the original feature set, in order to improve classification accuracy and reduce machine learning time. Traditional feature selection algorithms tend to ignore features whose individual discriminative capacity is weak but which are strongly discriminative as a group. Therefore, a new dynamic interaction feature selection (DIFS) algorithm is proposed in this paper. First, it redefines feature relevance, irrelevance and redundancy under the theoretical framework of interaction information. Second, it gives the formulas for computing interaction information. Finally, on eleven UCI data sets and with three different classifiers, namely KNN, SVM and C4.5, the DIFS algorithm increases the classification accuracy over the FullSet by 3.2848% and reduces the number of selected features by 15.137 on average. Hence, the DIFS algorithm can not only identify relevant features effectively, but also identify irrelevant and redundant features. Moreover, it can effectively improve the classification accuracy on the data sets and reduce their feature dimensionality.

Keywords Feature selection · Feature interaction · Feature relevance · Feature redundancy · Filter method
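The interaction information that the abstract builds on can be estimated directly from discrete data. Below is a minimal illustrative sketch (not the paper's DIFS algorithm; the function names are the author's of this sketch), computing the mutual information I(X;C) between a feature and the class from frequency counts, and the three-way interaction information I(X;Y;C) = I(X,Y;C) − I(X;C) − I(Y;C), which is positive exactly when two features jointly tell more about the class than they do separately, the "weak as a monomer, strong as a group" case the paper targets:

```python
from collections import Counter
from math import log2

def mutual_information(xs, cs):
    """Estimate I(X;C) in bits from paired discrete samples."""
    n = len(xs)
    px = Counter(xs)
    pc = Counter(cs)
    pxc = Counter(zip(xs, cs))
    return sum(
        (cnt / n) * log2((cnt / n) / ((px[x] / n) * (pc[c] / n)))
        for (x, c), cnt in pxc.items()
    )

def interaction_information(xs, ys, cs):
    """I(X;Y;C) = I(X,Y;C) - I(X;C) - I(Y;C).

    Positive values mean X and Y together carry more information
    about the class C than the sum of their individual contributions."""
    xy = list(zip(xs, ys))
    return (mutual_information(xy, cs)
            - mutual_information(xs, cs)
            - mutual_information(ys, cs))

# XOR-like class labels: each feature alone is uninformative,
# but the pair determines the class -- a classic interacting pair.
X = [0, 0, 1, 1]
Y = [0, 1, 0, 1]
C = [0, 1, 1, 0]
print(round(mutual_information(X, C), 6))          # 0.0
print(round(interaction_information(X, Y, C), 6))  # 1.0
```

A filter method that scores features only by individual I(X;C) would discard both X and Y here, even though together they predict C perfectly; this is the failure mode that interaction-aware selection is meant to avoid.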

* Zhang Li, [email protected]

1 School of Computer Engineering, Jiangsu University of Technology, Changzhou 213001, Jiangsu, China

2 Key Laboratory of Trustworthy Distributed Computing and Service (Ministry of Education), Beijing University of Posts and Telecommunications, Beijing 100876, China

1 Introduction

In the field of machine learning and pattern recognition [1], feature selection based on classification models has attracted wide attention from researchers. It aims to find an optimal feature subset of the original data set, one that can represent the original data set. Its advantages [2] are that it can reduce machine learning time, avoid overfitting, cut down the physical storage of the data set and improve the classification accuracy of the algorithm. From the point of view of the subset evaluation function, feature selection algorithms can be divided into three categories [3]: embedded, wrapper and filter methods. Embedded methods and wrapper methods usually