An extensive experimental evaluation of automated machine learning methods for recommending classification algorithms
RESEARCH PAPER
M. P. Basgalupp1 · R. C. Barros2 · A. G. C. de Sá1 · G. L. Pappa1 · R. G. Mantovani3 · A. C. P. L. F. de Carvalho3 · A. A. Freitas4

Received: 9 April 2020 / Accepted: 1 July 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract

This paper presents an experimental comparison of four automated machine learning (AutoML) methods for recommending the best classification algorithm for a given input dataset. Three of these methods are based on evolutionary algorithms (EAs), and the fourth is Auto-WEKA, a well-known AutoML method based on the combined algorithm selection and hyper-parameter optimisation (CASH) approach. Each EA-based method builds classification algorithms from a single machine learning paradigm: decision-tree induction, rule induction, or Bayesian network classification. Auto-WEKA combines algorithm selection and hyper-parameter optimisation to recommend classification algorithms from multiple paradigms. We performed controlled experiments in which all four AutoML methods were given the same runtime limit, for several values of this limit. In general, the differences in predictive accuracy among the three best AutoML methods were not statistically significant. However, the EA that evolves decision-tree induction algorithms has the advantage of producing algorithms that generate interpretable classification models and that are more scalable to large datasets, compared with many algorithms from other learning paradigms that Auto-WEKA can recommend. We also observed that Auto-WEKA exhibited meta-overfitting, a form of overfitting at the meta-learning level rather than at the base-learning level.

Keywords Evolutionary algorithms · Algorithm recommendation · Automated machine learning · Classification · Meta-learning
1 Introduction
Classification is one of the main machine learning tasks and, hence, a large variety of classification algorithms is available [1, 2]. However, in most real-world applications, the choice of a classification algorithm for a new dataset or application domain is still largely an ad hoc decision. In this context, the use of meta-learning for algorithm recommendation is an important research area, with seminal work dating back more than 20 years, including the StatLog [3] and METAL [4] projects. Meta-learning can be defined as learning how to learn: learning, from previous experience, which machine learning algorithm (and which hyper-parameter setting) is best for a given dataset [5, 6]. Meta-learning systems for algorithm recommendation can be divided into two broad groups: (a) systems that perform algorithm selection based on meta-features [5], which is the most investigated type; and (b) systems that search for the best possible classification algorithm in a given algorithm space [7].
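To make the first group concrete, the following sketch illustrates meta-feature-based algorithm selection in its simplest form: describe a dataset by a few meta-features and recommend the algorithm that worked best on the most similar previously seen dataset. The meta-features chosen, the toy meta-dataset, and its accuracy values are all illustrative assumptions, not taken from this paper's experiments.

```python
import math
from collections import Counter

def meta_features(X, y):
    """Extract three simple meta-features of a dataset:
    number of instances, number of attributes, and class entropy."""
    n_instances = len(X)
    n_features = len(X[0]) if X else 0
    counts = Counter(y)
    entropy = -sum((c / n_instances) * math.log2(c / n_instances)
                   for c in counts.values())
    return (n_instances, n_features, entropy)

# Hypothetical meta-dataset: meta-features of past datasets, each paired
# with the classification paradigm that performed best on it.
META_DATASET = [
    ((150, 4, 1.58), "decision-tree induction"),
    ((1000, 20, 0.99), "rule induction"),
    ((300, 8, 1.00), "Bayesian network classification"),
]

def recommend(X, y):
    """Recommend an algorithm via 1-nearest-neighbour lookup
    in meta-feature space."""
    mf = meta_features(X, y)
    nearest = min(META_DATasets := META_DATASET,
                  key=lambda entry: math.dist(mf, entry[0]))
    return nearest[1]

# New dataset: 150 instances, 4 attributes, 2 unbalanced classes.
X = [[5.1, 3.5, 1.4, 0.2]] * 100 + [[6.2, 2.8, 4.8, 1.8]] * 50
y = ["a"] * 100 + ["b"] * 50
print(recommend(X, y))
```

Real meta-learning systems use many more meta-features (statistical, information-theoretic, landmarking) and a learned meta-model rather than a raw nearest-neighbour lookup, but the recommend-by-similarity principle is the same.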