Mapping Ensembles of Trees to Sparse, Interpretable Multilayer Perceptron Networks



ORIGINAL RESEARCH

Dalia Rodríguez‑Salas¹ · Nina Mürschberger¹ · Nishant Ravikumar² · Mathias Seuret¹ · Andreas Maier¹

Received: 30 April 2020 / Accepted: 21 July 2020
© The Author(s) 2020

Abstract

Tree-based classifiers provide easy-to-understand outputs. Artificial neural networks (ANNs) commonly outperform tree-based classifiers; nevertheless, understanding their outputs requires specialized knowledge in most cases. The highly redundant architectures of ANNs are typically designed through an expensive trial-and-error scheme. We aim to (1) investigate whether using ensembles of decision trees to design the architecture of low-redundancy, sparse ANNs yields better-performing networks, and (2) evaluate whether such trees can be used to provide human-understandable explanations of the networks' outputs. Information about the hierarchy of the features, and about how well they separate subsets of samples among the classes, is gathered from each branch of an ensemble of trees. This information is used to design the architecture of a sparse multilayer perceptron network. Networks built using our method are called ForestNets. Tree branches corresponding to highly activated neurons are used to provide explanations of the networks' outputs. ForestNets are able to handle low- and high-dimensional data, as we show in an evaluation on four datasets. Our networks consistently outperformed their respective ensembles of trees and performed similarly to their fully connected counterparts, with a significant reduction in the number of connections. Furthermore, our interpretation method appears to support the ForestNets' outputs. While ForestNet architectures do not yet capture the intrinsic variability of visual data well, they exhibit very promising results, removing more than 98% of the connections for such visual tasks. Structural similarities between ForestNets and their respective tree ensembles provide a means to interpret their outputs.

Keywords: Multilayer perceptron · Random forest · Interpretable models · Sparse neural networks · Network architecture · Interpretability

This article is part of the topical collection "Machine Learning in Pattern Analysis", guest edited by Reinhard Klette, Brendan McCane, Gabriella Sanniti di Baja, Palaiahnakote Shivakumara, and Liang Wang.

* Dalia Rodríguez‑Salas, [email protected]
Nina Mürschberger, [email protected]
Nishant Ravikumar, [email protected]
Mathias Seuret, [email protected]
Andreas Maier, [email protected]

¹ Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, Martensstr. 3, 91058 Erlangen, Germany
² School of Computing, University of Leeds, Leeds LS2 9JT, UK
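The abstract describes the core construction only at a high level. As a rough illustration of the idea, the sketch below gathers the feature set along each root-to-leaf branch of a scikit-learn random forest and uses those sets as a sparse input-to-hidden connectivity mask for a small PyTorch MLP. This is a hypothetical reading, not the paper's ForestNet algorithm: the helper names (branch_feature_sets, SparseMLP), the one-hidden-layer design, and the one-neuron-per-branch rule are our own assumptions.

```python
# Hypothetical sketch: derive a sparse input->hidden mask from the
# branches of a fitted tree ensemble.  The paper's actual ForestNet
# construction (layer sizes, initialisation, depth) is more involved.
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier


def branch_feature_sets(tree):
    """Collect, for every root-to-leaf branch, the set of feature
    indices tested along that branch."""
    t = tree.tree_
    branches = []

    def walk(node, features):
        if t.children_left[node] == -1:  # leaf reached: one full branch
            branches.append(sorted(features))
            return
        f = t.feature[node]
        walk(t.children_left[node], features | {f})
        walk(t.children_right[node], features | {f})

    walk(0, set())
    return branches


class SparseMLP(nn.Module):
    """One-hidden-layer MLP whose input->hidden weights are masked so
    that each hidden neuron only sees the features of one tree branch."""

    def __init__(self, mask, n_classes):
        super().__init__()
        n_hidden, n_in = mask.shape
        self.hidden = nn.Linear(n_in, n_hidden)
        self.out = nn.Linear(n_hidden, n_classes)
        self.register_buffer("mask", torch.tensor(mask, dtype=torch.float32))

    def forward(self, x):
        # Re-apply the fixed mask so pruned connections stay at zero
        # even after gradient updates.
        h = torch.relu(F.linear(x, self.hidden.weight * self.mask,
                                self.hidden.bias))
        return self.out(h)


X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=3, max_depth=3, random_state=0)
forest.fit(X, y)

# One hidden neuron per branch; connect it only to that branch's features.
branches = [b for est in forest.estimators_ for b in branch_feature_sets(est)]
mask = np.zeros((len(branches), X.shape[1]))
for i, feats in enumerate(branches):
    mask[i, feats] = 1.0

net = SparseMLP(mask, n_classes=3)
print(f"{len(branches)} hidden neurons, "
      f"{int(mask.sum())} of {mask.size} input connections kept")
```

Multiplying the weight matrix by the fixed mask in the forward pass keeps pruned connections at zero throughout training, which is one common way to realize a sparse architecture of this kind.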
Introduction

Multilayer perceptron (MLP) networks have been successfully used to solve many complicated learning tasks, whether they involve visual or non-visual data. MLPs were first used mainly as classifiers, and the traditional methodologies involving them are mainly ba