Learning optimal decision trees using constraint programming
Hélène Verhaeghe1 · Siegfried Nijssen1 · Gilles Pesant2 · Claude-Guy Quimper3 · Pierre Schaus1
Accepted: 29 September 2020 © Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract

Decision trees are among the most popular classification models in machine learning. Traditionally, they are learned using greedy algorithms. However, such algorithms have several disadvantages: it is difficult to limit the size of the decision trees while maintaining good classification accuracy, and it is hard to impose additional constraints on the models that are learned. For these reasons, there has been a recent interest in exact and flexible algorithms for learning decision trees. In this paper, we introduce a new approach to learning decision trees using constraint programming. Compared to earlier approaches, we show that our approach obtains better performance, while still being sufficiently flexible to allow for the inclusion of constraints. Our approach builds on three key building blocks: (1) the use of AND/OR search, (2) the use of caching, and (3) the use of the CoverSize global constraint, proposed recently for the problem of itemset mining. This allows our constraint programming approach to handle the decompositions in the learning problem much more efficiently.

Keywords Decision tree · CoverSize · AND/OR search tree · Caching
Hélène Verhaeghe
[email protected]

Siegfried Nijssen
[email protected]

Gilles Pesant
[email protected]

Claude-Guy Quimper
[email protected]

Pierre Schaus
[email protected]

1 UCLouvain, ICTEAM, Place Sainte Barbe 2, 1348 Louvain-la-Neuve, Belgium
2 Polytechnique Montréal, Montréal, Canada
3 Université Laval, Québec, Canada
1 Introduction

Decision trees are popular classification models in machine learning. Their benefits include that they are relatively easy to interpret and that they provide good classification performance on many datasets.

Several methods have been proposed in the literature for learning decision trees. Greedy methods are the most popular ones [1–3]. These methods recursively partition a dataset into two subsets based on a greedily selected attribute until some stopping criterion is reached (such as a minimum number of examples in a leaf, or a unique class label among these examples); a minimal sketch of this scheme is given at the end of this section. While in practice these methods obtain good prediction accuracy on many types of data, they unfortunately provide few guarantees. As a result, the trees learned using these methods may be unnecessarily complex, may be less accurate than possible, and it is hard to impose additional constraints on the trees, such as on the fairness of their predictions.

To address these weaknesses, researchers have studied the inference of optimal decision trees under constraints [4–10]. These approaches ensure that, under well-defined constraints and optimization criteria, an optimal tree is found. Experiments conducted
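To make the greedy scheme described above concrete, the following is a minimal Python sketch of recursive partitioning on binary features, in the spirit of CART-style algorithms [1–3]. All names (greedy_tree, errors, majority), the misclassification-count split score, and the stopping criteria are illustrative assumptions, not the algorithm of any specific reference.

from collections import Counter

def majority(labels):
    # Most frequent class label among the given examples.
    return Counter(labels).most_common(1)[0][0]

def errors(labels):
    # Misclassifications made by a leaf predicting the majority class.
    return len(labels) - Counter(labels).most_common(1)[0][1]

def greedy_tree(rows, labels, depth, min_leaf=1):
    # rows: equal-length tuples of binary features; labels: one class per row.
    # Stopping criteria (illustrative): depth limit, pure node, or few examples.
    if depth == 0 or len(set(labels)) == 1 or len(rows) <= min_leaf:
        return ('leaf', majority(labels))
    best = None
    for f in range(len(rows[0])):
        left = [i for i, r in enumerate(rows) if r[f] == 0]
        right = [i for i, r in enumerate(rows) if r[f] == 1]
        if not left or not right:
            continue  # this feature does not split the data
        score = (errors([labels[i] for i in left]) +
                 errors([labels[i] for i in right]))
        if best is None or score < best[0]:
            best = (score, f, left, right)
    if best is None:  # no feature partitions the examples
        return ('leaf', majority(labels))
    _, f, left, right = best
    return ('node', f,
            greedy_tree([rows[i] for i in left],
                        [labels[i] for i in left], depth - 1, min_leaf),
            greedy_tree([rows[i] for i in right],
                        [labels[i] for i in right], depth - 1, min_leaf))

# Example: four examples, two binary features.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 1, 1]
print(greedy_tree(rows, labels, depth=2))  # ('node', 0, ('leaf', 0), ('leaf', 1))

Note that each split is chosen locally and never revisited, which is exactly why such a procedure carries no optimality guarantee; closing this gap is the goal of the exact approaches discussed above.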