A new framework of multi-objective evolutionary algorithms for feature selection and multi-label classification of video


ORIGINAL ARTICLE

A new framework of multi-objective evolutionary algorithms for feature selection and multi-label classification of video data

Gizem Nur Karagoz1 · Adnan Yazici1,2 · Tansel Dokeroglu3 · Ahmet Cosar4

Received: 27 November 2019 / Accepted: 10 June 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract

Few studies in the literature address multi-objective multi-label feature selection for the classification of video data using evolutionary algorithms. Selecting the most appropriate subset of features while maintaining or improving the accuracy of the prediction results is a significant problem. This study proposes a framework of parallel multi-objective Non-dominated Sorting Genetic Algorithms (NSGA-II) for exploring a Pareto set of non-dominated solutions. The subsets of non-dominated features are extracted and validated by the multi-label classification techniques Binary Relevance (BR), Classifier Chains (CC), Pruned Sets (PS), and Random k-Labelset (RAkEL). Base classifiers such as Support Vector Machines (SVM), J48-Decision Tree (J48), and Logistic Regression (LR) are used in the classification phase of the algorithms. Comprehensive experiments are carried out with local feature descriptors extracted from two multi-label datasets: the well-known MIR-Flickr dataset and a Wireless Multimedia Sensor (WMS) dataset that we have generated from our video recordings. The prediction accuracy levels are improved by 6.36% and 25.7% for the MIR-Flickr and WMS datasets, respectively, while the number of features is significantly reduced. The results verify that the algorithms presented in this new framework outperform the state-of-the-art algorithms.

Keywords  Multi-label classification · Multi-objective optimization · Evolutionary · Machine learning · Feature selection
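The notion of a Pareto set of non-dominated solutions mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes two objectives that are both minimized (number of selected features and classification error), and the candidate values are hypothetical. A full NSGA-II additionally ranks the population into successive fronts and uses crowding distance during selection, which this sketch omits.

```python
from typing import List, Tuple

# Assumption: each candidate feature subset is summarized by two objectives,
# (number_of_selected_features, classification_error), both to be minimized.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if `a` is no worse than `b` on every objective
    and strictly better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(candidates: List[Objectives]) -> List[Objectives]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical objective values for six candidate feature subsets.
candidates = [
    (120, 0.30),
    (80, 0.28),
    (80, 0.35),
    (40, 0.33),
    (200, 0.27),
    (60, 0.40),
]

front = non_dominated(candidates)
print(front)
```

Each subset left in `front` represents a different trade-off: none of the others uses fewer features without also having a higher error.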

* Tansel Dokeroglu [email protected]
Gizem Nur Karagoz [email protected]
Adnan Yazici [email protected]; [email protected]
Ahmet Cosar [email protected]

1 Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
2 Department of Computer Science, Nazarbayev University, Nur-Sultan, Kazakhstan
3 Department of Computer Engineering, TED University, Ankara, Turkey
4 Department of Computer Engineering, Ankara Bilim University, Ankara, Turkey

1 Introduction

We live in an era where computer systems produce very large amounts of data that must be processed to extract hidden knowledge. To make the best use of available computing resources, big data must be processed with specialized techniques. Some parts of the data may be contaminated, which can prevent the extraction of useful knowledge. Irrelevant and/or redundant data must be eliminated, preferably even before being transmitted to a big data store, to reduce the data processing load, increase classification accuracy, and obtain better data models. Complex data structures need to be designed for filtering out irrelevant data. Efficient data mining a