Feature Selection Algorithms and Student Academic Performance: A Study


Abstract Every educational organization aims to raise the academic achievement of its students. Educational data mining (EDM) is an emerging field of research that helps academic institutions predict students' academic performance. Educational datasets are the basis of various predictive models, and extracting value from them requires tools for analysis and prediction; machine learning and data mining are well suited to this task. Feature selection (FS) plays a vital role in improving both the accuracy of prediction models and the quality of educational datasets. FS algorithms remove irrelevant information from educational data repositories so that classifier accuracy increases and the resulting models support better decisions. An appropriate feature selection algorithm must therefore be chosen. In this paper, two feature selection approaches, namely correlation feature selection (CFS), a filter method, and wrapper-based feature selection, are used to demonstrate the importance of selecting a feature subset for a classification problem. The paper investigates these feature selection algorithms in combination with several classification algorithms on a given dataset. We report results for the different numbers of features chosen by each feature selection algorithm and classifier, which will help researchers identify the best combination of feature selection algorithm and classifier.
The results indicate that SMO and J48 achieve the highest accuracy with correlation feature selection, while Naïve Bayes achieves the highest accuracy with wrapper subset feature selection, when predicting high, medium and low grades for students.
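The idea behind correlation feature selection can be illustrated with a small sketch. It is not the authors' experimental setup: the data are synthetic, the grade labels (0 = low, 1 = medium, 2 = high) are generated for illustration, and absolute Pearson correlation is assumed as the correlation measure (CFS implementations may use other measures, such as symmetrical uncertainty). The function names `cfs_merit` and `cfs_forward` are hypothetical. The merit of a subset of k features is k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation, so relevant but mutually redundant features are penalized.

```python
import numpy as np


def cfs_merit(X, y, subset):
    """CFS merit of a feature subset, using |Pearson r| (an assumption)."""
    k = len(subset)
    # Mean correlation between each selected feature and the class label.
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    # Mean correlation over all distinct pairs of selected features.
    r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                    for i, a in enumerate(subset) for b in subset[i + 1:]])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward(X, y):
    """Greedy forward search: keep adding the feature that raises the merit."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        score, j = max((cfs_merit(X, y, selected + [j]), j) for j in remaining)
        if score <= best:
            break  # no remaining feature improves the merit
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best


# Synthetic data (illustrative only): features 0 and 1 drive the grade,
# feature 2 is a near-copy of feature 0, and feature 3 is pure noise.
rng = np.random.default_rng(0)
n = 300
f0, f1, f3 = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)
f2 = f0 + 0.1 * rng.normal(size=n)
X = np.column_stack([f0, f1, f2, f3])
score = f0 + f1 + 0.5 * rng.normal(size=n)
# Discretize the score into three grades: 0 = low, 1 = medium, 2 = high.
y = np.digitize(score, np.quantile(score, [1 / 3, 2 / 3]))

selected, merit = cfs_forward(X, y)
print("selected feature indices:", selected, "merit:", round(float(merit), 3))
```

A wrapper method differs in that it scores each candidate subset by training and evaluating the target classifier itself, rather than by a correlation-based heuristic, which is usually more expensive but tailored to that classifier.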

C. Jalota (✉) · R. Agrawal
Faculty of Computer Applications, Manav Rachna International Institute of Research and Studies, Faridabad, India

© Springer Nature Singapore Pte Ltd. 2021
D. Gupta et al. (eds.), International Conference on Innovative Computing and Communications, Advances in Intelligent Systems and Computing 1165, https://doi.org/10.1007/978-981-15-5113-0_23

Keywords Feature selection algorithms · J48 · Random forest · Correlation feature selection · Wrapper feature selection

1 Introduction

Education is the key factor in the development of any nation, and the quality of education is the most essential ingredient for bringing change to society. To improve the educational process, we have to explore the hidden information in the large volumes of data kept in the databases of academic institutions. Students' academic performance can be evaluated with many techniques, such as data mining and machine learning. With the hel