Extracting diverse-shapelets for early classification on time series


Wenhe Yan¹ · Guiling Li¹,² · Zongda Wu³ · Senzhang Wang⁴ · Philip S. Yu⁵

Received: 1 February 2019 / Revised: 26 December 2019 / Accepted: 22 April 2020 / © Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

In recent years, early classification on time series has become increasingly important in time-sensitive applications. Existing shapelet-based methods still do not work well on this problem. First, the effectiveness of traditional shapelet-based methods is affected by the number of shapelet candidates. Second, it is difficult for previous methods to obtain diverse shapelets during shapelet selection. In this paper, we propose an Improved Early Distinctive Shapelet Classification method named IEDSC. We first present a new method to measure the similarity between time series more precisely, which takes into account the relative trend of the time series. Second, in shapelet extraction, we propose a pruning technique that reduces the number of shapelets by predicting the starting positions of good-quality shapelets. In addition, we propose a new shapelet selection method that removes similar shapelets, so as to maintain the diversity of shapelets. Finally, the experimental results on 16 benchmark datasets show that the proposed method outperforms state-of-the-art methods for early classification on time series.

Keywords Time series · Early classification · Shapelet

Guiling Li
[email protected]

¹ School of Computer Science, China University of Geosciences, Wuhan, China
² Hubei Key Laboratory of Intelligent Geo-Information Processing, China University of Geosciences, Wuhan, China
³ Department of Computer Science and Engineering, Shaoxing University, Shaoxing, China
⁴ College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu, China
⁵ Department of Computer Science, University of Illinois at Chicago, Chicago, IL, USA

World Wide Web

1 Introduction

A time series is a sequence of data points ordered in time, so it is high-dimensional, continuous, and ever-growing. In recent years, time series classification has attracted rising research interest and has been widely used in many application domains, such as medical diagnosis [19, 27, 31], disaster prediction [38], industrial production control [39], financial markets [32] and community discovery [7]. At present, time series classification has many new extensions, of which early classification on time series data is becoming increasingly important. Early classification on time series can be used in many time-sensitive fields, including but not limited to video surveillance, intrusion detection, earthquake warning, early diagnosis, and action recognition [23, 28]. For example, in the early diagnosis of heart disease, abnormal ECG signals may indicate a specific heart disease that needs immediate treatment. Early diagnosis is critical in applications such as intensive care. If a classification mode