A strong intuitionistic fuzzy feature association map-based feature selection technique for high-dimensional data
Sådhanå (2020) 45:242 https://doi.org/10.1007/s12046-020-01475-2
AMIT KUMAR DAS1, SAPTARSI GOSWAMI2, AMLAN CHAKRABARTI1 and BASABI CHAKRABORTI3,*
1 A. K. Choudhury School of Information Technology, University of Calcutta, Kolkata, India
2 Bangabasi Morning College, University of Calcutta, Kolkata, India
3 Iwate Prefectural University, Takizawa, Japan
e-mail: [email protected]; [email protected]; [email protected]; [email protected]
MS received 27 February 2020; revised 19 July 2020; accepted 25 July 2020

Abstract. In this work, a graph-based approach is adopted for feature selection on high-dimensional data. Feature selection aims to identify an optimal feature subset for the given learning problem. In an optimal feature subset, only relevant features are selected as "members", and redundant features are treated as "non-members". This notion of "membership" and "non-membership" of a feature in an optimal feature subset is represented by a strong intuitionistic fuzzy graph. The proposed algorithm first maps the feature set of the data to the vertex set of a strong intuitionistic fuzzy graph. The association between features, represented as the edge set, is then decided by the degree of hesitation between the features. Based on this feature association, the Strong Intuitionistic Fuzzy Feature Association Map (SIFFAM) is developed for the datasets. A sub-graph of SIFFAM is then derived to identify features with maximal non-redundancy and relevance. Finally, the SIFFAM-based feature selection algorithm is applied to very high-dimensional datasets with features of the order of thousands. Empirically, the proposed SIFFAM-based feature selection algorithm is found to be competitive with several benchmark feature selection algorithms in the context of high-dimensional data.

Keywords. Feature selection; strong intuitionistic fuzzy graph; mutual information; high-dimensional datasets.
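The graph construction named in the abstract can be illustrated with a minimal sketch based on the standard textbook definitions, not on the paper's own procedure. In an intuitionistic fuzzy set each element carries a membership degree mu and a non-membership degree nu with mu + nu <= 1, and the hesitation degree is 1 - mu - nu. In a strong intuitionistic fuzzy graph every edge takes the minimum of its endpoints' memberships and the maximum of their non-memberships. How the paper derives mu and nu for a feature is not stated in this excerpt, so the feature names and values below are hypothetical, and the function names are assumptions made for illustration.

```python
from itertools import combinations


def hesitation(mu, nu):
    """Hesitation degree 1 - mu - nu of an intuitionistic fuzzy value."""
    if not (0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0 and mu + nu <= 1.0 + 1e-12):
        raise ValueError("need 0 <= mu, nu <= 1 and mu + nu <= 1")
    return 1.0 - mu - nu


def strong_edges(vertices):
    """Edge values of a strong intuitionistic fuzzy graph.

    vertices maps a feature name to its (mu, nu) pair. Each edge gets
    mu = min of the endpoint memberships and nu = max of the endpoint
    non-memberships (the standard "strong" condition).
    """
    edges = {}
    for u, v in combinations(sorted(vertices), 2):
        mu_u, nu_u = vertices[u]
        mu_v, nu_v = vertices[v]
        edges[(u, v)] = (min(mu_u, mu_v), max(nu_u, nu_v))
    return edges


# Hypothetical (mu, nu) values; the paper's scoring is not given in this excerpt.
features = {"f1": (0.7, 0.2), "f2": (0.5, 0.3), "f3": (0.2, 0.6)}
edges = strong_edges(features)

# Hesitation of each edge, computed from its own (mu, nu) pair (an assumption).
edge_hesitation = {pair: hesitation(mu, nu) for pair, (mu, nu) in edges.items()}
```

Under these definitions the edge between f1 and f3 inherits the weaker membership (0.2) and the stronger non-membership (0.6). Which sub-graph is then retained is specific to the paper's algorithm and is not sketched here.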
1. Introduction

The digital world is moving towards a new paradigm: in place of traditional business-driven decision-making, it is heading towards data-driven decision-making. Machine learning is used extensively to generate knowledge from past data, with learning models trained on an input dataset. A couple of decades ago, input datasets had fewer than 100 dimensions in most application domains. In the late 1990s, very few domains had more than 40 features [1]. Over the last two decades, however, data storage capacity has multiplied, and so have the volume and the dimension of the input data fed into machine learning models. From a time when machine learning models had to deal with fewer than 100 features, we have moved to a time when models are dealing with very high-dimensional data