Bound smoothing based time series anomaly detection using multiple similarity measures



Wenqing Wang¹ · Junpeng Bao¹ · Tao Li¹
Received: 8 November 2019 / Accepted: 28 April 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Time series data are pervasive in many applications, and detecting anomalies in them is important because it can provide early warning of unexpected patterns. In this paper, we propose a method for detecting anomalous subsequences that is based on multiple similarity measures, is unsupervised, and requires no domain knowledge. First, to improve time efficiency, we introduce an anomaly candidate selection scheme based on locality sensitive hashing (LSH), which treats a subsequence that does not collide with any other subsequence as a potential anomaly. However, if the raw time series is noisy and the anomaly is subtle, the performance of LSH degrades. To address this problem, we present a smoothing method that removes noise and highlights the anomalous part of a time series, which helps to decrease the collision probability between an anomaly and the other subsequences. Second, we employ Pareto analysis to incorporate multiple similarity measures, since real applications contain different types of anomalies and a single similarity measure is unlikely to perform consistently well on all of them. Third, we provide a new anomaly scoring scheme, based on the number of non-dominated vectors, to evaluate each anomaly candidate. Finally, we conduct extensive experiments on benchmark datasets from diverse domains and compare our method with state-of-the-art approaches. The results show that our method achieves higher accuracy.

Keywords Anomaly detection · Time series · Bound smoothing · Multiple similarity measures
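Two of the ideas named in the abstract can be illustrated with a minimal sketch: selecting candidates by LSH non-collision, and finding non-dominated vectors in a Pareto analysis. This excerpt does not specify the hash family, the smoothing step, or the exact scoring rule. The sign-random-projection hash, the function names `lsh_candidates` and `non_dominated`, and all parameters below are therefore assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def lsh_candidates(windows, n_bits=16, seed=0):
    """Return indices of windows that share their hash bucket with no other window.

    Assumption: sign-random-projection LSH, where each hash bit is the sign of
    the dot product between a window and a random Gaussian direction.
    """
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_bits, windows.shape[1]))
    bits = (windows @ planes.T) > 0           # shape: (n_windows, n_bits)
    keys = [row.tobytes() for row in bits]    # one bucket key per window
    counts = {}
    for key in keys:
        counts[key] = counts.get(key, 0) + 1
    # A window alone in its bucket collided with nothing: keep it as a candidate.
    return [i for i, key in enumerate(keys) if counts[key] == 1]


def non_dominated(scores):
    """Boolean mask of rows that no other row Pareto-dominates.

    Each row holds one score per similarity measure (larger = more dissimilar).
    Row j dominates row i when j >= i on every measure and j > i on at least one.
    """
    scores = np.asarray(scores, dtype=float)
    mask = np.ones(len(scores), dtype=bool)
    for i, row in enumerate(scores):
        dominates_row = np.all(scores >= row, axis=1) & np.any(scores > row, axis=1)
        mask[i] = not dominates_row.any()
    return mask


# Hypothetical data: ten identical cycles, with one cycle inverted.
pattern = np.sin(np.linspace(0.0, 2.0 * np.pi, 32, endpoint=False))
windows = np.tile(pattern, (10, 1))
windows[7] = -pattern
print(lsh_candidates(windows))

# Hypothetical dissimilarity vectors under two measures.
print(non_dominated([[1.0, 1.0], [2.0, 0.5], [0.5, 2.0], [0.4, 0.4]]))
```

In this toy setup the identical cycles always fall into the same bucket, while the inverted cycle flips every hash bit and so ends up alone. With real, noisy data a subtle anomaly may still collide with normal subsequences, which is the failure mode the abstract's smoothing step is meant to reduce.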

Electronic supplementary material The online version of this article (https://doi.org/10.1007/s10845-020-01583-0) contains supplementary material, which is available to authorized users.

* Junpeng Bao
Wenqing Wang
Tao Li
1 Department of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, People’s Republic of China

Introduction Anomaly detection refers to the problem of finding patterns in data that stand out as being dissimilar to all the others (Chalapathy and Chawla 2019). Time series are pervasive in the real world, and anomaly detection on them has been studied for decades because of its wide range of applications in areas such as network intrusion, commercial fraud, medical and public health, satellite monitoring, industrial damage, and traffic congestion (Chandola et al. 2009). There is an extensive literature on time series anomaly detection. One family consists of prediction-based methods (Appice et al. 2014; Laptev et al. 2015; Malhotra et al. 2015), which train a prediction model on past data and then use the trained model to predict the values of subsequent time steps. And the anomaly score of each time s
