Handling concept drift via model reuse


Peng Zhao1 · Le-Wen Cai1 · Zhi-Hua Zhou1

Received: 2 May 2019 / Revised: 16 July 2019 / Accepted: 6 September 2019 © The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2019

Abstract

In many real-world applications, data are collected in the form of a stream, and the underlying distribution typically changes over time, which is referred to as concept drift in the literature. We propose a novel and effective approach to handle concept drift via model reuse, that is, reusing models trained on previous data to tackle the changes. Each model is associated with a weight representing its reusability towards current data, and the weight is adaptively adjusted according to the performance of the model. We provide both generalization and regret analysis to justify the superiority of our approach. Experimental results also validate its efficacy on both synthetic and real-world datasets.

Keywords Concept drift · Model reuse · Non-stationary environments
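The weighting idea summarized in the abstract can be illustrated with a small sketch. This is not the algorithm proposed in the paper; it is a generic exponential-weights update over a fixed pool of models, shown only to make "a weight adjusted according to performance" concrete. The two threshold classifiers, the toy stream, and the learning rate `eta` are all hypothetical choices made for illustration.

```python
import math

def update_weights(weights, losses, eta=1.0):
    """Exponential-weights update: shrink each weight by exp(-eta * loss), then renormalize."""
    scaled = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]

def weighted_predict(weights, predictions):
    """Weighted vote over predictions in {-1, +1}; ties are broken towards +1."""
    score = sum(w * p for w, p in zip(weights, predictions))
    return 1 if score >= 0 else -1

# Hypothetical pool of previously trained models (fixed threshold classifiers).
models = [
    lambda x: 1 if x > 0.0 else -1,  # matches the concept before the drift
    lambda x: 1 if x < 0.0 else -1,  # matches the concept after the drift
]

# Toy stream of (feature, label) pairs: the labeling rule flips after round 5.
stream = [
    (0.5, 1), (-0.3, -1), (0.8, 1), (-0.6, -1), (0.2, 1),        # old concept
    (0.4, -1), (-0.7, 1), (0.9, -1), (-0.1, 1), (0.3, -1),       # new concept
    (-0.5, 1), (0.6, -1), (-0.2, 1),
]

weights = [0.5, 0.5]  # start with no preference between the models
history = []          # (prediction, label, weights after the update) per round
for x, y in stream:
    preds = [m(x) for m in models]
    y_hat = weighted_predict(weights, preds)
    losses = [0.0 if p == y else 1.0 for p in preds]  # 0-1 loss per model
    weights = update_weights(weights, losses, eta=1.0)
    history.append((y_hat, y, list(weights)))

print("weights after the old concept:", history[4][2])
print("weights at the end of the stream:", weights)
```

In this toy run the first model dominates while the old concept holds, and the weight mass moves to the second model once enough post-drift rounds have been observed. How quickly that happens depends on `eta`, which is a free parameter in this sketch.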

1 Introduction

With the rapid development of data collection technology, it is of great importance to analyze and extract knowledge from vast amounts of data. However, data commonly arrive in streaming form and are usually collected from non-stationary environments, and thus they evolve over time. In other words, the joint distribution of the input features and the target label changes, which is referred to as concept drift in the literature (Gama et al. 2014). If we simply ignore the distribution change when learning from an evolving data stream, performance can drop dramatically, which is undesirable both empirically and theoretically. Consequently, concept drift has become one of the most challenging issues in data stream learning and has motivated researchers to design practically effective and theoretically sound algorithms.

Editors: Kee-Eung Kim and Jun Zhu.

1 National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China

Machine Learning

However, a data stream with concept drift is essentially impossible to learn (predict) without some assumption on the distribution change. That is, if the underlying distribution changes arbitrarily or even adversarially, there is no hope of learning a good model for prediction. We share the assumption made by most previous works, namely that previous data contain some knowledge that is useful for future prediction. Sliding window based approaches (Klinkenberg and Joachims 2000; Bifet and Gavaldà 2007; Kuncheva and Zliobaite 2009), forgetting based strategies (Koychev 2000; Klinkenberg 2004; Zhao et al. 2019), and ensemble based methods (Kolter and Maloof 2005, 2007; Sun et al. 2018) all rely on this assumption; they differ in how they exploit the knowledge in previous data. Another issu