Item Response Theory Based Ensemble in Machine Learning

Ziheng Chen, Hongshik Ahn
Department of Applied Mathematics and Statistics, Stony Brook University, New York 11794-3600, USA

Abstract: In this article, we propose a novel probabilistic framework to improve the accuracy of a weighted majority voting algorithm. In order to assign higher weights to the classifiers that can correctly classify hard-to-classify instances, we introduce the item response theory (IRT) framework to evaluate the samples' difficulty and the classifiers' ability simultaneously. We assign weights to classifiers based on their estimated abilities. Three models are created with different assumptions suitable for different cases. When making an inference, we keep a balance between accuracy and complexity. In our experiments, all the base models are single classification trees constructed via bootstrap sampling. To explain the models, we illustrate how the IRT ensemble model constructs the classification boundary. We also compare their performance with other widely used methods and show that our model performs well on 19 datasets.

Keywords: Classification, ensemble learning, item response theory, machine learning, expectation maximization (EM) algorithm.

Research Article. Manuscript received February 16, 2020; accepted June 4, 2020. Recommended by Associate Editor Matjaz Gams. © Institute of Automation, Chinese Academy of Sciences and Springer-Verlag GmbH Germany, part of Springer Nature 2020.
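For background, the idea of evaluating sample difficulty and classifier ability simultaneously can be illustrated with the standard two-parameter logistic (2PL) IRT model, in which the probability of a correct response increases with the subject's ability and decreases with the item's difficulty. The sketch below shows this standard form only; the function name and example values are illustrative and are not the authors' exact parameterization.

```python
import numpy as np

def irt_2pl(theta, a, b):
    """Standard 2PL item response function: probability that a subject
    with ability theta answers an item of difficulty b correctly,
    with discrimination parameter a."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# A more able classifier (higher theta) is more likely to classify
# a hard instance (higher b) correctly.
print(irt_2pl(theta=2.0, a=1.0, b=1.0))  # ~0.73
print(irt_2pl(theta=0.5, a=1.0, b=1.0))  # ~0.38
```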

 

1 Introduction

Classification ensembles have been gaining increasing attention in machine learning, especially for improving prediction accuracy. The most important feature distinguishing ensemble learning from other types of learning is that it combines the predictions of a group of classifiers rather than depending on a single classifier[1]. It has been shown in many cases that aggregation methods such as bagging, boosting and incremental learning outperform methods that lack a collective decision strategy. If one had to identify an idea as central and novel to ensemble learning, it is the combination rule, which comes in two main forms: simple majority voting and weighted majority voting. Simple majority voting is a decision rule that combines the decisions of the classifiers in the ensemble[1]. It is widely applied in ensemble learning due to its simplicity and applicability[2]. In weighted majority voting, each classifier's decision is multiplied by a weight reflecting its ability, and the final decision is made by combining the weighted decisions[3]; both rules are sketched below. Both methods derive the classifiers' weights from their performance on the training data, so no additional parameter tuning is required once the individual classifiers have been trained.
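To make the two combination rules concrete, here is a minimal sketch of simple and weighted majority voting for binary classifiers. The array shapes, function names, and example weights are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def simple_majority_vote(predictions):
    """Combine 0/1 votes by simple majority.
    predictions: array of shape (n_classifiers, n_samples)."""
    return (predictions.mean(axis=0) > 0.5).astype(int)

def weighted_majority_vote(predictions, weights):
    """Weight each classifier's vote by its (estimated) ability
    before combining; weights has shape (n_classifiers,)."""
    weighted = np.average(predictions, axis=0, weights=weights)
    return (weighted > 0.5).astype(int)

# Three classifiers voting on four samples.
preds = np.array([[1, 0, 1, 1],
                  [0, 0, 1, 0],
                  [1, 1, 0, 1]])
print(simple_majority_vote(preds))                     # [1 0 1 1]
print(weighted_majority_vote(preds, [0.6, 0.1, 0.3]))  # ability-based weights
```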
Here we propose a novel probabilistic framework for the weighted voting classification ensemble. We treat each data point as a problem and each classifier as a subject taking an exam in class. As we know, the performance of one student on a pro