An Overview of Outliers and Detection Methods in General for Time Series from IoT Devices


Abstract. As Internet of Things (IoT) devices proliferate, a huge amount of data lies unused. At the same time, reliable and accurate time series analysis plays a key role in the efficient management of modern intelligent systems. One reason why the data are not being used is that outliers prevent many algorithms from working effectively. Manual data cleaning takes up the majority of the time before a solution can really work on the data. Thus, data cleaning, and especially fully automated outlier detection, is a bottleneck that should be resolved as soon as possible. Previous work has investigated this topic but lacks an overview that covers outlier categorization and detection categorization at the same time. This work aims to start covering this topic and to find a direction for making outlier detection and labelling more automated and general, so that they suit most time series data from IoT devices.

Keywords: Survey, Anomaly novelty detection, Time series, Internet of things

1 Introduction

Time series analysis is widely used in intelligent transport, smart medical assistance, weather forecasting and financial systems, among other time-dynamic science and engineering topics. To achieve the desired results, a lot of data are needed, and these data should be clean and of good quality. However, dirty data are a big problem nowadays. Monitoring and collecting data are important, but before any analysis starts, we need clean data. No matter what resources we have, it is nearly always necessary to clean the data and label the outliers in them. Outliers in raw data prevent algorithms from achieving their best performance. In this paper, we try to get an overview of outliers and detection methods to see how to tackle dirty data.

2 Categorization of Outliers

For data in general, outliers are commonly categorized into three general types: point, contextual and collective [1, 2].

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liu et al. (Eds.): CENet 2020, AISC 1274, pp. 1180–1186, 2021. https://doi.org/10.1007/978-981-15-8462-6_135


Point outliers, shown in Fig. 1a [3], are often used when analysing multi-dimensional data. They are also known as global outliers because the original point-based methods do not consider the local context. Global outliers are usually detected by applying some kind of threshold.
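The text does not say which threshold is meant, so the following is only a minimal sketch of one common choice: flagging points whose distance from the median exceeds a multiple of the scaled median absolute deviation (MAD). The function name `global_outliers`, the cut-off `k = 3.5` and the sample readings are illustrative assumptions, not taken from the paper.

```python
from statistics import median


def global_outliers(values, k=3.5):
    """Return indices of points flagged by a median/MAD threshold.

    Assumption: k = 3.5 is a conventional cut-off for this rule,
    not a value given in the paper.
    """
    med = median(values)
    # Median absolute deviation: a spread estimate that the
    # outliers themselves barely influence.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No measurable spread, so the rule cannot rank points.
        return []
    # 1.4826 rescales the MAD to match the standard deviation
    # for normally distributed data.
    scale = 1.4826 * mad
    return [i for i, v in enumerate(values) if abs(v - med) / scale > k]


# Hypothetical temperature readings in °C with one extreme value.
readings = [21.0, 21.4, 20.9, 21.2, 80.0, 21.1, 20.8]
print(global_outliers(readings))  # prints [4]
```

Because the rule looks only at the overall distribution of values, it ignores when or where a reading was taken, which is exactly why such point-based methods are described as global.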

Fig. 1. (a) Point (global) outliers are marked as triangles [3]. (b) A collective outlier in electrocardiographic signal [5]. (c) Two collective outliers due to football matches [6]. (d) An additive outlier is marked as A while a consecutive outlier is marked as B. (e) A long-time range consecutive outlier in time series.

In contrast, contextual outliers are useful when an observation’s context matters. For example, 80 °C is a global outlier, whereas 30 °C is a contextual outlier in the Nordic area but normal in India. Contextual outliers are al
