An Energy-Based Similarity Measure for Time Series
Research Article
Abdel-Ouahab Boudraa,1,2 Jean-Christophe Cexus,2 Mathieu Groussat,1 and Pierre Brunagel1
1 IRENav, Ecole Navale, Lanvéoc Poulmic, BP600, 29240 Brest-Armées, France
2 E3I2, EA 3876, ENSIETA, 29806 Brest Cedex 9, France
Correspondence should be addressed to Abdel-Ouahab Boudraa, [email protected]
Received 27 August 2006; Revised 30 March 2007; Accepted 24 July 2007
Recommended by Jose C. M. Bermudez
A new similarity measure for time series analysis, called SimilB, is introduced. It is based on the cross-ΨB-energy operator (2004). ΨB is a nonlinear measure that quantifies the interaction between two time series. Unlike the Euclidean distance (ED) or the Pearson correlation coefficient (CC), SimilB includes the temporal information and the relative changes of the time series by using their first and second derivatives. SimilB is well suited to both stationary and nonstationary time series, particularly those presenting discontinuities. Some new properties of ΨB are presented. In particular, we show that ΨB as a similarity measure is robust to both scale and time shift. SimilB is illustrated with synthetic time series and an artificial dataset, and it is compared to the CC and ED measures.
Copyright © 2008 Abdel-Ouahab Boudraa et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. INTRODUCTION
A time series (TS) is a sequence of real numbers, each of which represents the value of an attribute of interest (stock or commodity price, sale, exchange, weather data, biomedical measurement, etc.). TS datasets are common in fields such as medicine, finance, and multimedia. For example, in gesture recognition and video sequence matching using computer vision, several features are extracted continuously from each image, and the resulting feature sequences are TSs [2]. Typical applications of TSs include classification, clustering, similarity search, prediction, and forecasting. These applications rely heavily on the ability to measure the similarity or dissimilarity between TSs [3]. Defining the similarity of TSs or objects is crucial in any data analysis and decision-making process. The simplest approach to defining a similarity function is based on the Euclidean distance (ED) or on extensions of it that support transformations such as scaling or shifting. The ED may fail to produce a correct similarity measure between TSs because it cannot deal with outliers and it is very sensitive to small distortions in the time axis [4]. The Pearson correlation coefficient (CC) is a popular measure for comparing TSs. Yet the CC is not necessarily coherent with the shape of the TSs, and it accounts for neither the order of the time points nor uneven sampling intervals. Furthermore, similarity measures using the ED or the CC do not include temporal information and the relative changes of the TSs.
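The two limitations described above can be illustrated with a minimal Python sketch. It is not taken from the article: the helper names `euclidean_distance` and `pearson_cc`, the Gaussian pulse, the 10-sample delay, and the sinusoid-plus-trend pair are all assumptions chosen for illustration. The first part compares a pulse with a delayed copy of itself and with a flat signal; the second part applies the same random reordering of the time points to both series and recomputes the CC.

```python
import numpy as np

def euclidean_distance(x, y):
    """Euclidean distance between two equal-length series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sqrt(np.sum((x - y) ** 2)))

def pearson_cc(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))

n = 200
t = np.arange(n)

# Part 1: sensitivity of the ED to a small distortion in the time axis.
# Illustrative data: a narrow pulse, the same pulse delayed by 10 samples,
# and a flat (all-zero) signal.
pulse = np.exp(-0.5 * ((t - 100) / 3.0) ** 2)
pulse_delayed = np.exp(-0.5 * ((t - 110) / 3.0) ** 2)
flat = np.zeros(n)

ed_delayed = euclidean_distance(pulse, pulse_delayed)
ed_flat = euclidean_distance(pulse, flat)
print("ED(pulse, delayed pulse):", ed_delayed)
print("ED(pulse, flat signal):  ", ed_flat)

# Part 2: the CC ignores the order of the time points.
# Illustrative data: a sinusoid and a scaled copy with a linear trend.
a = np.sin(2 * np.pi * t / 50)
b = 0.5 * a + 0.2 * t / n

rng = np.random.default_rng(0)
perm = rng.permutation(n)  # same reordering applied to both series

cc_ordered = pearson_cc(a, b)
cc_shuffled = pearson_cc(a[perm], b[perm])
print("CC, original order:", cc_ordered)
print("CC, shuffled order:", cc_shuffled)
```

With these illustrative signals, the delayed pulse ends up farther from the original pulse in ED than the flat signal does, even though it has exactly the same shape. The CC is unchanged, up to floating-point rounding, when both series are reordered in the same way, which is one sense in which it carries no temporal information.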