Improving deep forest by ensemble pruning based on feature vectorization and quantum walks




METHODOLOGIES AND APPLICATION

Jie Gao1 · Kunhong Liu1 · Beizhan Wang1 · Dong Wang2 · Xiaoyan Zhang3

© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract
Recently, a deep learning model, the deep forest (DF), was designed as an alternative to deep neural networks. Each cascade layer of the DF contains a set of random forests (RFs) with a large number of decision trees, some of which are highly redundant or perform poorly. To avoid the negative impact of such decision trees, this paper proposes to optimize the RFs in each cascade layer of the DF so as to realize a pruned deep forest (PDF) with higher performance and a smaller ensemble size. A new ordering-based ensemble pruning method is proposed based on feature vectorization and quantum walks. This method simultaneously considers the accuracy and the diversity of the base classifiers, and it provides an integrated evaluation criterion for ordering the base classifiers in the ensemble system. The effectiveness of the proposed method is verified by experiments and discussions.

Keywords Deep forest · Ensemble pruning · Feature vectorization · Quantum walks

Communicated by V. Loia.

Beizhan Wang (corresponding author): [email protected]
Kunhong Liu: [email protected]

1 School of Informatics, Xiamen University, Xiamen 361005, China
2 State Grid Fujian Electric Power Company, Fuzhou 350003, China
3 Xiamen University Tan Kah Kee College, Xiamen 363105, China

1 Introduction

Ensemble learning is a branch of machine learning that trains a group of base classifiers and then combines them to produce improved results. An individual base classifier can learn only one hypothesis from the raw data, whereas an ensemble system with multiple base classifiers can learn multiple hypotheses. In a well-constructed ensemble system, the base classifiers should be as accurate and diverse as possible, so that their combination can compensate for the prediction errors of any single base classifier through the voting mechanism. The ensemble system generally performs much better than a single base classifier (Guo et al. 2018). However, the ensemble system has three obvious flaws: the use of multiple base classifiers incurs a vast computational cost; base classifiers with poor prediction performance negatively affect the ensemble system; and base classifiers with high similarity introduce redundancy into the ensemble system. Therefore, to improve the performance of an ensemble system, it is necessary to remove poorly performing and highly similar base classifiers; this process is known as ensemble pruning. Many studies indicate that selecting an optimal sub-ensemble to replace the original ensemble can improve ensemble performance and reduce ensemble size (Dai et al. 2017; Guo et al. 2018). In other words, ensemble pruning mainly focuses on selecting base classifiers with high accuracy and good diversity from the original ensemble. Man
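As a toy illustration of the voting mechanism described above, the following sketch (all classifier predictions here are hypothetical, not taken from the paper) shows how three imperfect but diverse base classifiers, each wrong on a different sample, can jointly recover every label by plurality vote:

```python
# Minimal sketch of plurality (majority) voting in an ensemble.
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier predictions by plurality voting.

    predictions: list of lists, where predictions[i][j] is base
    classifier i's predicted label for sample j.
    """
    n_samples = len(predictions[0])
    combined = []
    for j in range(n_samples):
        votes = Counter(clf[j] for clf in predictions)
        combined.append(votes.most_common(1)[0][0])
    return combined

# Three diverse base classifiers on five samples whose true labels
# are [1, 0, 1, 1, 0]; each classifier errs on a different sample,
# so each has accuracy 4/5 in isolation.
clf_a = [1, 0, 1, 1, 1]  # wrong on sample 4
clf_b = [1, 0, 0, 1, 0]  # wrong on sample 2
clf_c = [0, 0, 1, 1, 0]  # wrong on sample 0

print(majority_vote([clf_a, clf_b, clf_c]))  # → [1, 0, 1, 1, 0]
```

Because the errors fall on different samples (high diversity), the vote corrects all of them; had the three classifiers erred on the same samples (high similarity), voting would have preserved the errors, which is exactly why ensemble pruning favors accurate and diverse sub-ensembles.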