ORIGINAL ARTICLE

Flat random forest: a new ensemble learning method towards better training efficiency and adaptive model size to deep forest

Peng Liu1 · Xuekui Wang2 · Liangfei Yin3 · Bing Liu4

Received: 20 May 2019 / Accepted: 2 May 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

* Corresponding author: Bing Liu, [email protected]
Peng Liu, [email protected] · Xuekui Wang, xuekui.wxk@alibaba-inc.com · Liangfei Yin, [email protected]

1 National Joint Engineering Laboratory of Internet Applied Technology of Mines, China University of Mining and Technology, Xuzhou 221008, Jiangsu, China
2 Alibaba Group, Hangzhou 311121, Zhejiang, China
3 School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China
4 School of Computer S…

Abstract
The known deficiencies of deep neural networks include inferior training efficiency, weak parallelization capability and too many hyper-parameters. To address these issues, some researchers presented deep forest, a special deep learning model, which achieves significant improvements but still suffers from poor training efficiency, an inflexible model size and weak interpretability. This paper endeavors to solve these issues in a new way. Firstly, deep forest is extended to the densely connected deep forest to enhance prediction accuracy. Secondly, to enable parallel training with an adaptive model size, the flat random forest is proposed by balancing the width and depth of the densely connected deep forest. Finally, two core algorithms are presented, one for the forward computation of the output weights and one for output weight updating. The experimental results show that, compared with deep forest, the proposed flat random forest achieves competitive prediction accuracy, higher training efficiency, fewer hyper-parameters and an adaptive model size.

Keywords  Ensemble learning · Flat random forest · Training efficiency · Size-adaptive model

1 Introduction

As is known, ensemble learning is an effective method for improving the performance of individual models. Typical ensemble learning methods include unsupervised clustering algorithms [1], Bagging [2], Boosting [3], XGBoost [4], rule aggregation [5] and Random Forest (RF) [6]. Recently, Zhou et al. presented Deep Forest (gcForest) [7] as a counterpart of the convolutional neural network (CNN). Compared with CNN, gcForest has some advantages, such as fewer hyper-parameters, shorter training time and lower computation cost. Nevertheless, gcForest still has the following inadequacies:

1. Training Efficiency

In gcForest, the output of each RF layer is fed as input to the next layer, which prevents parallelization across layers during model training. Moreover, by means of multi-grained scanning (MGS) [7], gcForest extracts only locally spatial input features (e.g. from sequence data and images) for training each random forest. Thereby, a separate random forest has to be trained for each scanning window size, which results in poor training efficiency, as sketched below.
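To make this cost structure concrete, the following minimal Python sketch (not the authors' code; the window sizes, estimator counts and the two-layer cascade are illustrative assumptions) mimics how multi-grained scanning trains a separate forest per window size, and how each cascade layer consumes the class-probability output of the previous one, so layers cannot be trained in parallel.

```python
# Minimal sketch of gcForest-style multi-grained scanning and cascading.
# Window sizes, estimator counts and the two-layer cascade are assumptions
# chosen only to illustrate the sequential, per-window-size training cost.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32))        # 200 toy sequences of length 32
y = rng.integers(0, 2, size=200)      # binary labels
n_classes = 2

def sliding_windows(X, w):
    """Cut each sequence into all contiguous windows of length w."""
    n, d = X.shape
    return np.stack([X[:, i:i + w] for i in range(d - w + 1)], axis=1)  # (n, d-w+1, w)

# One forest per window size: training cost grows with the number of sizes.
window_sizes = [8, 16, 24]
mgs_features = []
for w in window_sizes:
    wins = sliding_windows(X, w)                          # (n, n_win, w)
    n, n_win, _ = wins.shape
    rf = RandomForestClassifier(n_estimators=30, random_state=0)
    rf.fit(wins.reshape(n * n_win, w), np.repeat(y, n_win))
    proba = rf.predict_proba(wins.reshape(n * n_win, w))  # (n*n_win, n_classes)
    mgs_features.append(proba.reshape(n, n_win * n_classes))
X_mgs = np.hstack(mgs_features)

# Cascade: each layer's class-probability output is appended to the input
# of the next layer, so the layers must be trained strictly one after another.
layer_input = X_mgs
for layer in range(2):
    rf = RandomForestClassifier(n_estimators=50, random_state=layer)
    rf.fit(layer_input, y)
    layer_input = np.hstack([X_mgs, rf.predict_proba(layer_input)])
```

In the actual gcForest, the class vectors are produced with cross-validation and each layer uses several forests of different types; the sketch omits these details to keep the per-window-size and layer-by-layer training costs visible.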
