Efficient matching of very complex time series
ORIGINAL ARTICLE
Bachir Boucheham
Received: 11 May 2011 / Accepted: 15 June 2012 / © Springer-Verlag 2012
Abstract We propose a novel method (FANSEA) that performs very complex time series matching. Matching here covers both comparison and alignment of time series, for diverse needs: diagnosis, clustering, retrieval, mining, etc. The complexity lies in the fact that the method can match quasi-periodic time series that are possibly phase shifted, of different lengths, composed of different numbers of periods, characterized by local morphological changes, and possibly shifted or scaled along the time or magnitude axis. This is the most complex case that can occur in time series matching. The efficiency lies in the fact that FANSEA produces alignments comparable to those of the previously published SEA method while, as a result of data reduction, consuming much less time and data, which allows faster matching and lower storage requirements. FANSEA is composed of two main steps: data reduction by curve simplification of the time series traces, and matching through the exchange of extracted signatures between the time series being processed. Because the electrocardiogram (ECG) is quasi-periodic, the tests were conducted on records selected from the Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) database. Numerically, the data reduction achieved by the new method was up to 80 % and the time reduction was up to 95 %. Accordingly, among many possible applications, the new method is well suited to searching, querying and mining large time series databases.
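To make the first of the two steps concrete, the following is a minimal sketch of data reduction by curve simplification. The abstract does not say which simplification algorithm FANSEA uses, so this sketch assumes a Douglas-Peucker-style split driven by vertical (magnitude-axis) deviation. The names `simplify` and `tolerance` are hypothetical and do not come from the article.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (sample index, magnitude)


def _vertical_deviation(p: Point, a: Point, b: Point) -> float:
    """Magnitude-axis distance between point p and the chord a-b."""
    (t, y), (t0, y0), (t1, y1) = p, a, b
    if t1 == t0:
        return abs(y - y0)
    y_on_chord = y0 + (y1 - y0) * (t - t0) / (t1 - t0)
    return abs(y - y_on_chord)


def simplify(series: Sequence[float], tolerance: float) -> List[Point]:
    """Keep only the samples needed to stay within `tolerance` of the trace.

    Assumption: a Douglas-Peucker-style split is used here purely for
    illustration; the article's own simplification rule may differ.
    """
    points = [(float(i), float(v)) for i, v in enumerate(series)]
    if len(points) < 3:
        return points

    keep = [False] * len(points)
    keep[0] = keep[-1] = True
    stack = [(0, len(points) - 1)]  # segments still to be examined

    while stack:
        lo, hi = stack.pop()
        worst, worst_dev = -1, tolerance
        # Find the interior sample that deviates most from the chord lo-hi.
        for i in range(lo + 1, hi):
            dev = _vertical_deviation(points[i], points[lo], points[hi])
            if dev > worst_dev:
                worst, worst_dev = i, dev
        if worst != -1:
            # The chord is too coarse: keep that sample and split there.
            keep[worst] = True
            stack.append((lo, worst))
            stack.append((worst, hi))

    return [p for p, kept in zip(points, keep) if kept]
```

A larger `tolerance` discards more samples, trading fidelity for speed and storage. This is the kind of trade-off the reported data and time reductions reflect, although the figures in the abstract refer to the article's own method, not to this sketch.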
B. Boucheham (✉)
Department of Informatics, University of Skikda, BP 26, Route El-Hadaek, 21000 Skikda, Algeria
e-mail: [email protected]
Keywords Pattern matching · Machine learning · Time series alignment · Data reduction · Data mining · Data retrieval
1 Introduction
Modeling, analysis and exploration of time series are important in many fields of science and technology. They are particularly useful in knowledge discovery, in machine learning and in the diagnosis of the systems that generate these time series. Application domains include economics, e.g., financial data [1], physiology, e.g., [2–4], data retrieval by content, e.g., music retrieval by humming [5–7], and fault/anomaly/novelty detection in industrial systems [8–11]. One basic operation used by many time series analysis and exploration systems is the comparison of two given time series based on their shapes. That is, given two time series, the comparison operation consists of deciding whether their traces are similar enough. In general, one of the two time series serves as the reference (the known behavior), whereas the second serves as the target (the unknown behavior). Some typical examples are illustrated in Fig. 1I–III, where the reference-target time series pairs are (a, b), (c, d) and (e, f). Many techniques have been developed for time series comparison. Yet, the