


ORIGINAL RESEARCH

Selective Cascade of Residual ExtraTrees

Qimin Liu1,* · Fang Liu2

Received: 18 May 2020 / Accepted: 30 September 2020
© Springer Nature Singapore Pte Ltd 2020

* Qimin Liu: [email protected]; Fang Liu: [email protected]

1 Department of Psychology and Human Development, Vanderbilt University, Nashville, TN, USA
2 Department of Applied and Computational Mathematics & Statistics, University of Notre Dame, Notre Dame, IN, USA

Abstract

We propose a novel tree-based ensemble method named Selective Cascade of Residual ExtraTrees (SCORE). SCORE draws inspiration from representation learning, incorporates regularized regression with variable selection, and uses boosting to improve prediction and reduce generalization error. We also develop a variable importance measure to increase the explainability of SCORE. Our computer experiments show that SCORE delivers predictive performance comparable or superior to ExtraTrees, random forest, gradient boosting machine, and neural networks, and that its variable importance measure is comparable to the benchmark methods studied. Finally, the predictive performance of SCORE remains stable across hyper-parameter values, suggesting potential robustness to hyper-parameter specification.

Keywords: Extremely randomized trees · Boosting · Ensemble learning · Regularized regression · Variable importance measure · Explainability
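The abstract names three ingredients: extremely randomized trees, regularized regression with variable selection, and boosting on residuals. The Python sketch below illustrates, in general terms, how such ingredients can be combined; it is not the authors' SCORE algorithm. The stage structure, the Lasso-based selection rule, the learning rate, and all hyper-parameter values are assumptions made for illustration only.

```python
# Illustrative sketch only: a residual-boosting cascade of ExtraTrees stages with
# Lasso-based variable selection between stages. This is NOT the SCORE algorithm
# from the paper; the stage structure and selection rule are assumptions.
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.linear_model import LassoCV

def fit_residual_cascade(X, y, n_stages=3, learning_rate=0.5, random_state=0):
    """Fit a toy cascade on numpy arrays: each stage selects variables with a
    Lasso fit to the current residuals, then fits ExtraTrees on those variables."""
    stages, residual = [], y.astype(float).copy()
    for _ in range(n_stages):
        # Regularized regression with variable selection (Lasso) on the residuals.
        lasso = LassoCV(cv=5, random_state=random_state).fit(X, residual)
        selected = np.flatnonzero(lasso.coef_ != 0)
        if selected.size == 0:          # fall back to all features if none survive
            selected = np.arange(X.shape[1])
        # Extremely randomized trees fit to the residuals on the selected features.
        trees = ExtraTreesRegressor(n_estimators=100, random_state=random_state)
        trees.fit(X[:, selected], residual)
        residual = residual - learning_rate * trees.predict(X[:, selected])
        stages.append((selected, trees))
    return stages

def predict_residual_cascade(stages, X, learning_rate=0.5):
    """Sum the shrunken stage predictions, mirroring the boosting update above."""
    pred = np.zeros(X.shape[0])
    for selected, trees in stages:
        pred += learning_rate * trees.predict(X[:, selected])
    return pred
```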

Introduction

Ensemble learning methods combine multiple learning algorithms to achieve better performance than any of the individual algorithms alone. The success of ensemble methods can be attributed, in part, to the diversity among the constituent members, which helps mitigate over-fitting and reduce the generalization error [1]. We focus on tree-based ensemble methods for regression. Tree-based ensemble methods construct more than one decision tree and, by exploiting the diversity among the trees, increase the generalizability of the ensemble. Such diversity often comes from perturbation in the optimization process of individual trees. For example, tree bagging generates bootstrap samples, from which decision trees are obtained and averaged [2]. Random subspace selects a pseudo-random subset of features when building trees [3]. Random forest (RF) selects a random subset of the features for each candidate split and trains the trees on bootstrap samples [4].

The computational cost of tree-based ensemble approaches increases drastically with the number of trees because of both the optimization and the perturbation processes. One way to reduce this burden is to replace optimization with randomization. C4.5, for example, randomly chooses from the best 20 splits when constructing individual trees [1]; extremely randomized trees (ExtraTrees) randomly select a cut-point for each candidate split variable [5]; and [6] proposes "randomly" choosing a non-tested feature at each level of the tree without using any training data. However, the extreme randomness and lack of optimization
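To make the contrast between optimization and randomization concrete, the snippet below compares a random forest, which optimizes cut-points and draws bootstrap samples, with ExtraTrees, which draws cut-points at random and by default trains each tree on the whole sample. It is a minimal sketch using scikit-learn; the synthetic dataset, number of trees, and scoring choice are arbitrary illustrative settings, not from the paper.

```python
# Minimal sketch: random forest (optimized cut-points, bootstrap samples) versus
# ExtraTrees (random cut-points, whole sample by default) on a synthetic task.
from sklearn.datasets import make_friedman1
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = make_friedman1(n_samples=1000, n_features=10, noise=1.0, random_state=0)

models = {
    "random forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "ExtraTrees": ExtraTreesRegressor(n_estimators=200, random_state=0),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean CV R^2 = {scores.mean():.3f}")
```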