ORIGINAL ARTICLE

Flat random forest: a new ensemble learning method towards better training efficiency and adaptive model size to deep forest

Peng Liu1 · Xuekui Wang2 · Liangfei Yin3 · Bing Liu4

Received: 20 May 2019 / Accepted: 2 May 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

* Corresponding author: Bing Liu, [email protected]
Peng Liu, [email protected] · Xuekui Wang, xuekui.wxk@alibaba-inc.com · Liangfei Yin, [email protected]

1 National Joint Engineering Laboratory of Internet Applied Technology of Mines, China University of Mining and Technology, Xuzhou 221008, Jiangsu, China
2 Alibaba Group, Hangzhou 311121, Zhejiang, China
3 School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China
4 School of Computer S…

Abstract
The known deficiencies of deep neural networks include inferior training efficiency, weak parallelization capability and too many hyper-parameters. To address these issues, some researchers presented deep forest, a special deep learning model, which achieves significant improvements but still suffers from poor training efficiency, an inflexible model size and weak interpretability. This paper endeavors to solve these issues in a new way. Firstly, deep forest is extended to the densely connected deep forest to enhance prediction accuracy. Secondly, to enable parallel training with an adaptive model size, the flat random forest is proposed by balancing the width and depth of the densely connected deep forest. Finally, two core algorithms are presented, one for the forward computation of the output weights and one for output weight updating. The experimental results show that, compared with deep forest, the proposed flat random forest achieves competitive prediction accuracy, higher training efficiency, fewer hyper-parameters and an adaptive model size.

Keywords  Ensemble learning · Flat random forest · Training efficiency · Size-adaptive model

1 Introduction

As is known, ensemble learning is an effective method for improving the performance of individual models. Typical ensemble learning methods include unsupervised clustering algorithms [1], Bagging [2], Boosting [3], XGBoost [4], rule aggregation [5] and Random Forest (RF) [6]. Recently, Zhou et al. presented Deep Forest (gcForest) [7] as a counterpart of the convolutional neural network (CNN). Compared with CNN, gcForest has some advantages, such as fewer hyper-parameters, shorter training time and lower computation cost. Nevertheless, gcForest still has the following inadequacies:

1. Training Efficiency

In gcForest, the output of each RF layer is fed as input to the next layer, which prevents parallelization across layers during model training. Moreover, by means of multi-grained scanning (MGS) [7], gcForest extracts only locally spatial input features (e.g. from sequence data and images) for training each random forest. Thereby, a separate random forest has to be trained for each scanning window size, which results in poor training efficiency, as sketched below.
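To make this cost structure concrete, the following minimal Python sketch (not the authors' code; the window sizes, estimator counts and the two-layer cascade are illustrative assumptions) mimics how multi-grained scanning trains a separate forest per window size, and how each cascade layer consumes the class-probability output of the previous one, so layers cannot be trained in parallel.

```python
# Minimal sketch of gcForest-style multi-grained scanning and cascading.
# Window sizes, estimator counts and the two-layer cascade are assumptions
# chosen only to illustrate the sequential, per-window-size training cost.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32))        # 200 toy sequences of length 32
y = rng.integers(0, 2, size=200)      # binary labels
n_classes = 2

def sliding_windows(X, w):
    """Cut each sequence into all contiguous windows of length w."""
    n, d = X.shape
    return np.stack([X[:, i:i + w] for i in range(d - w + 1)], axis=1)  # (n, d-w+1, w)

# One forest per window size: training cost grows with the number of sizes.
window_sizes = [8, 16, 24]
mgs_features = []
for w in window_sizes:
    wins = sliding_windows(X, w)                          # (n, n_win, w)
    n, n_win, _ = wins.shape
    rf = RandomForestClassifier(n_estimators=30, random_state=0)
    rf.fit(wins.reshape(n * n_win, w), np.repeat(y, n_win))
    proba = rf.predict_proba(wins.reshape(n * n_win, w))  # (n*n_win, n_classes)
    mgs_features.append(proba.reshape(n, n_win * n_classes))
X_mgs = np.hstack(mgs_features)

# Cascade: each layer's class-probability output is appended to the input
# of the next layer, so the layers must be trained strictly one after another.
layer_input = X_mgs
for layer in range(2):
    rf = RandomForestClassifier(n_estimators=50, random_state=layer)
    rf.fit(layer_input, y)
    layer_input = np.hstack([X_mgs, rf.predict_proba(layer_input)])
```

In the actual gcForest, the class vectors are produced with cross-validation and each layer uses several forests of different types; the sketch omits these details to keep the per-window-size and layer-by-layer training costs visible.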
