Embedding Decision Trees and Random Forests in Constraint Programming

Abstract. In past papers, we have introduced Empirical Model Learning (EML) as a method to enable Combinatorial Optimization on real world systems that are impervious to classical modeling approaches. The core idea in EML consists in embedding a Machine Learning model in a traditional combinatorial model. So far, the method has been demonstrated by using Neural Networks and Constraint Programming (CP). In this paper we add one more technique to the EML arsenal, by devising methods to embed Decision Trees (DTs) in CP. In particular, we propose three approaches: 1) a simple encoding based on meta-constraints; 2) a method using attribute discretization and a global table constraint; 3) an approach based on converting a DT into a Multi-valued Decision Diagram, which is then fed to an mdd constraint. We finally show how to embed in CP a Random Forest, a powerful type of ensemble classifier based on DTs. The proposed methods are compared in an experimental evaluation, highlighting their strengths and their weaknesses.

1 Introduction

Combinatorial Optimization methods have been successfully applied to a broad range of industrial problems. Many such approaches rely on the availability of a declarative model describing the decisions, the constraints on those decisions, their cost, and their impact on the system under consideration. In short, they rely on an accurate problem model. In some application domains, however, the model is either not fully known or is described in a form that is not usable for combinatorial optimization. For example, many domains have predictive models that forecast the temporal dynamics of a target system via differential equations, but such models cannot be inserted into a combinatorial optimization model without incurring serious computational issues. In these cases, the domain expert typically supplies heuristic knowledge about the problem, i.e., a rough (and hard-to-quantify) approximation of the effect that certain decisions have on the system dynamics. We propose here an alternative approach, an instantiation of the general Empirical Model Learning method we introduced in [2]: we learn part of the combinatorial optimization model and embed the learned piece of knowledge in the combinatorial model itself. This has two advantages: first, we can use the learned knowledge to reduce the search tree; second, we know the accuracy obtained in the learning process.

In this paper, we consider the problem of embedding in a CP model two types of tree-based classifiers from Machine Learning, namely Decision Trees and Random Forests. Formally, a classifier is a function f mapping a tuple of values for a set of discrete or numeric attributes to a discrete class. We can embed a classifier in CP by introducing a vector of variables, one per attribute, plus a variable for the class, and posting a constraint that links them according to f.
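To make this embedding concrete, the sketch below illustrates the second approach from the abstract (attribute discretization plus a table constraint) on a toy tree. It is not the paper's implementation: the tree, the interval indices, and all variable names are illustrative assumptions, and Google OR-Tools CP-SAT stands in for a generic CP solver. Each attribute is discretized into the intervals induced by the tree's split thresholds, every root-to-leaf path becomes an allowed tuple, and a table constraint links the attribute variables to the class variable.

```python
from ortools.sat.python import cp_model

# Toy decision tree over two attributes, each discretized into 3
# intervals (bin indices 0..2) induced by the tree's split thresholds.
# Each tuple is (a0_bin, a1_bin, predicted_class); in practice the
# tuples are obtained by enumerating the tree's root-to-leaf paths.
ALLOWED = [
    (0, 0, 0), (0, 1, 0), (0, 2, 1),
    (1, 0, 0), (1, 1, 1), (1, 2, 1),
    (2, 0, 1), (2, 1, 1), (2, 2, 1),
]

model = cp_model.CpModel()
a0 = model.NewIntVar(0, 2, "a0_bin")  # discretized attribute 0
a1 = model.NewIntVar(0, 2, "a1_bin")  # discretized attribute 1
y = model.NewIntVar(0, 1, "y")        # class variable

# The table constraint ties the attribute variables to the class
# variable: propagation removes attribute bins that are inconsistent
# with the (possibly constrained) class, and vice versa.
model.AddAllowedAssignments([a0, a1, y], ALLOWED)

# Example: fix the classifier output and let propagation prune
# the attribute bins that cannot lead to that class.
model.Add(y == 0)

solver = cp_model.CpSolver()
status = solver.Solve(model)
if status in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    print(solver.Value(a0), solver.Value(a1), solver.Value(y))
```

Along the same lines, a Random Forest could plausibly be embedded by posting one such constraint per tree and aggregating the per-tree class variables (e.g., via a majority-vote constraint), though the paper's own encodings should be consulted for the details.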