TEASER: early and accurate time series classification


Patrick Schäfer¹ · Ulf Leser¹

Received: 5 September 2019 / Accepted: 14 May 2020
© The Author(s) 2020

Abstract

Early time series classification (eTSC) is the problem of classifying a time series after as few measurements as possible with the highest possible accuracy. The most critical issue for any eTSC method is deciding when enough of a time series has been seen to make a decision: waiting for more data points usually makes the classification problem easier but delays the time at which a classification is made; in contrast, earlier classification has to cope with less input data, often leading to inferior accuracy. State-of-the-art eTSC methods compute a fixed optimal decision time, assuming that every time series has the same defined start time (like turning on a machine). However, in many real-life applications measurements start at arbitrary times (like measuring the heartbeats of a patient), implying that the best time for making a decision varies widely between time series. We present TEASER, a novel algorithm that models eTSC as a two-tier classification problem: in the first tier, a classifier periodically assesses the incoming time series to compute class probabilities. However, these class probabilities are only used as the output label if a second-tier classifier decides that the predicted label is reliable enough, which can happen after a different number of measurements for each time series. In an evaluation using 45 benchmark datasets, TEASER makes predictions two to three times earlier than its competitors while reaching the same or even higher classification accuracy. We further show TEASER's superior performance on real-life use cases, namely energy monitoring and gait detection.

Keywords Time series · Early classification · Accurate · Framework
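The two-tier scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `classify_early`, the `step` parameter, and both toy classifiers are assumptions made for illustration only. The first tier maps a prefix of the series to class probabilities, and the second tier decides whether those probabilities are reliable enough to emit a label; here the second tier is a simple probability-margin threshold standing in for a trained classifier.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Probs = Dict[str, float]


def classify_early(
    series: Sequence[float],
    first_tier: Callable[[Sequence[float]], Probs],
    second_tier: Callable[[Probs], bool],
    step: int = 5,
) -> Tuple[Optional[str], int]:
    """Return (label, number of measurements used to decide)."""
    n = len(series)
    if n == 0:
        return None, 0
    # Periodically assess growing prefixes of the incoming series.
    for end in range(step, n + 1, step):
        probs = first_tier(series[:end])
        # Emit a label only if the second tier accepts the prediction.
        if second_tier(probs):
            return max(probs, key=probs.get), end
    # Never accepted: fall back to the prediction on the full series.
    probs = first_tier(series)
    return max(probs, key=probs.get), n


def toy_first_tier(prefix: Sequence[float]) -> Probs:
    # Hypothetical stand-in: P("high") is the prefix mean clipped to [0, 1].
    mean = sum(prefix) / len(prefix)
    p_high = min(1.0, max(0.0, mean))
    return {"high": p_high, "low": 1.0 - p_high}


def toy_second_tier(probs: Probs, margin: float = 0.6) -> bool:
    # Hypothetical stand-in: accept when the top two classes are far apart.
    ranked = sorted(probs.values(), reverse=True)
    return ranked[0] - ranked[1] >= margin


if __name__ == "__main__":
    series = [0.5] * 5 + [1.0] * 15
    print(classify_early(series, toy_first_tier, toy_second_tier))
```

With this toy input the prefixes of length 5 and 10 are rejected and the prefix of length 15 is accepted, so the decision time depends on the individual series rather than on a fixed global cutoff.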

Responsible editors: Ira Assent, Carlotta Domeniconi, Aristides Gionis, Eyke Hüllermeier

Patrick Schäfer [email protected]
Ulf Leser [email protected]

¹ Humboldt University of Berlin, Berlin, Germany



1 Introduction

A time series (TS) is a collection of values sequentially ordered in time. One strong force behind the rising importance of time series is the increasing use of sensors for automatic and high-resolution monitoring in domains like smart homes (Jerzak and Ziekow 2014), starlight observations (Protopapas et al. 2006), machine surveillance (Mutschler et al. 2013), or smart grids (Hobbs et al. 1999; Lew and Milligan 2016). Time series classification (TSC) is the problem of assigning one of a set of predefined classes to a time series, like recognizing the electronic device producing a certain temporal pattern of energy consumption (Gao et al. 2014; Gisler et al. 2013) or classifying a signal of earth motions as either an earthquake or a passing lorry (Perol et al. 2018). Conventional TSC works on time series of a given, fixed length and assumes access to the entire input at classification time. In contrast, early time series classification (eTSC), which we study in thi