Early Classification of Time Series as a Non Myopic Sequential Decision Making Problem

Classification of time series as early as possible is a valuable goal. Indeed, in many application domains, the earliest the decision, the more rewarding it can be. Yet, often, gathering more information allows one to get a better decision. The optimizati

  • PDF / 536,131 Bytes
  • 15 Pages / 439.37 x 666.142 pts Page_size
  • 82 Downloads / 239 Views

DOWNLOAD

REPORT


2

EDF R & D, 1, Avenue du G´en´eral de Gaulle, 92141 Clamart Cedex, France D´epartement MMIP Et INRA UMR-518, AgroParisTech, 16, Rue Claude Bernard, 75231 Paris Cedex 5, France [email protected] http://www.agroparistech.fr/mia/equipes:membres:page:antoine

Abstract. Classification of time series as early as possible is a valuable goal. Indeed, in many application domains, the earliest the decision, the more rewarding it can be. Yet, often, gathering more information allows one to get a better decision. The optimization of this time vs. accuracy tradeoff must generally be solved online and is a complex problem. This paper presents a formal criterion that expresses this trade-off in all generality together with a generic sequential meta algorithm to solve it. This meta algorithm is interesting in two ways. First, it pinpoints where choices can (have to) be made to obtain a computable algorithm. As a result a wealth of algorithmic solutions can be found. Second, it seeks online the earliest time in the future where a minimization of the criterion can be expected. It thus goes beyond the classical approaches that myopically decide at each time step whether to make a decision or to postpone the call one more time step. After this general setting has been expounded, we study one simple declination of the meta-algorithm, and we show the results obtained on synthetic and real time series data sets chosen for their ability to test the robustness and properties of the technique. The general approach is vindicated by the experimental results, which allows us to point to promising perspectives.

Keywords: Early classification of time series making

1

·

Sequential decision

Introduction

In many applications, it is natural to acquire the description of an object incrementally, with new measurements arriving sequentially. This is the case in medicine, when a patient undergoes successive examinations until it is determined that enough evidence has been acquired to decide with sufficient certainty the disease he/she is suffering from. Sometimes, the measurements are not controlled and just arrive over time, as when the behavior of a consumer on a web site is monitored on-line in order to predict what add to put on his/her screen. c Springer International Publishing Switzerland 2015  A. Appice et al. (Eds.): ECML PKDD 2015, Part I, LNAI 9284, pp. 433–447, 2015. DOI: 10.1007/978-3-319-23528-8 27

434

A. Dachraoui et al.

In these situations, one is interested in making a prediction as soon as possible, either because each measurement is costly or because it is critical to act as early as possible in order to yield higher returns. However, this generally induces a tradeoff as less measurements commonly entail more prediction errors that can be expensive. The question is therefore how to decide on-line that now is the optimal time to make a prediction. The problem of deciding when enough information has been gathered to make a reliable decision has historically been studied under the name of sequential decision making o