Delayed labelling evaluation for data streams


Maciej Grzenda¹ · Heitor Murilo Gomes²,³ · Albert Bifet²,³

Received: 20 November 2018 / Accepted: 5 September 2019 © The Author(s) 2019

Abstract

A large portion of the stream mining studies on classification rely on the availability of true labels immediately after making predictions. This approach is well exemplified by the test-then-train evaluation, where predictions immediately precede true label arrival. However, in many real scenarios, labels arrive with non-negligible latency. This raises the question of how to evaluate classifiers trained in such circumstances. This question is of particular importance when stream mining models are expected to refine their predictions between acquiring instance data and receiving its true label. In this work, we propose a novel evaluation methodology for data streams when verification latency takes place, namely continuous re-evaluation. It is applied to reference data streams and it is used to differentiate between stream mining techniques in terms of their ability to refine predictions based on newly arriving instances. Our study points out, discusses and shows empirically the importance of considering the delay of instance labels when evaluating classifiers for data streams.

Keywords: Stream mining · Delayed labels · Evaluation procedures · Classification

Responsible editor: Po-Ling Loh, Evimaria Terzi, Antti Ukkonen, and Karsten Borgwardt.

Maciej Grzenda [email protected]
Heitor Murilo Gomes [email protected]
Albert Bifet [email protected]

¹ Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
² LTCI, Télécom Paris, IP-Paris, Paris, France
³ Department of Computer Science, University of Waikato, Hamilton, New Zealand

1 Introduction

The evaluation of stream mining algorithms in many cases relies on the test-then-train or prequential approach (Gama and Rodrigues 2009): first, an unlabelled instance is used to generate a model prediction; next, the model is provided with the label of the instance to trigger possible model updates. This approach, while clear and uniform over all stream instances, is not applicable when labels are delayed. In this case, the time period between using the unlabelled instance data as a model input and receiving the true label is non-negligible. Since prequential evaluation does not match all stream mining settings, verification latency (Ditzler et al. 2015) and stream mining methods focusing on the delayed label setting have been investigated (Kuncheva and Sánchez 2008; Masud et al. 2011; Souza et al. 2015). Many of these works assume that the prediction for an instance of interest is made when the instance data arrives for the first time. However, in problems such as delay prediction in transportation systems, this approach can be extended to iteratively reconsider predictions already made. As an example, when the objective is to predict whether a pl