Treant: training evasion-aware decision trees



Stefano Calzavara¹ · Claudio Lucchese¹ · Gabriele Tolomei² · Seyum Assefa Abebe¹ · Salvatore Orlando¹

Received: 5 September 2019 / Accepted: 19 May 2020
© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2020

Abstract
Despite its success and popularity, machine learning is now recognized as vulnerable to evasion attacks, i.e., carefully crafted perturbations of test inputs designed to force prediction errors. In this paper we focus on evasion attacks against decision tree ensembles, which are among the most successful predictive models for dealing with non-perceptual problems. Even though they are powerful and interpretable, decision tree ensembles have received only limited attention from the security and machine learning communities so far, leading to a sub-optimal state of the art for adversarial learning techniques. We thus propose Treant, a novel decision tree learning algorithm that, on the basis of a formal threat model, minimizes an evasion-aware loss function at each step of the tree construction. Treant is based on two key technical ingredients, robust splitting and attack invariance, which jointly guarantee the soundness of the learning process. Experimental results on publicly available datasets show that Treant is able to generate decision tree ensembles that are both accurate and nearly insensitive to evasion attacks, outperforming state-of-the-art adversarial learning techniques.

Keywords Adversarial machine learning · Robust learning · Decision tree ensembles

Responsible editor: Ira Assent, Carlotta Domeniconi, Aristides Gionis, Eyke Hüllermeier.

Claudio Lucchese [email protected]

¹ Università Ca’ Foscari Venezia, Venice, Italy
² Sapienza Università di Roma, Roma, Italy

1 Introduction
Machine Learning (ML) is increasingly used in a wide range of applications and contexts. When ML is leveraged to ensure system security, such as in spam filtering and intrusion detection, there is broad agreement on the need to train ML models that are resilient to adversarial manipulations (Huang et al. 2011; Biggio and Roli 2018). The same need applies to other critical application scenarios in which ML is now employed, where adversaries may cause severe system malfunctions or faults. For example, consider an ML model used by a bank to grant loans to inquiring customers: a malicious customer may try to fool the model into illicitly qualifying them for a loan. Unfortunately, traditional ML algorithms have proved vulnerable to a wide range of attacks, and in particular to evasion attacks, i.e., carefully crafted perturbations of test inputs designed to force prediction errors (Biggio et al. 2013; Nguyen et al. 2015; Papernot et al. 2016a; Moosavi-Dezfooli et al. 2016). To date, research on evasion attacks has mostly focused on linear classifiers (Lowd and Meek 2005; Biggio et al. 2011) and, more recently, on deep neural networks (Szegedy et al. 2014; Goodfellow et al.