SMILE: a feature-based temporal abstraction framework for event-interval sequence classification
Jonathan Rebane¹ · Isak Karlsson¹ · Leon Bornemann² · Panagiotis Papapetrou¹
Received: 2 July 2019 / Accepted: 20 October 2020
© The Author(s) 2020
Abstract
In this paper, we study the problem of classifying sequences of temporal intervals. Our main contribution is a novel framework, which we call SMILE, for extracting relevant features from interval sequences to construct classifiers. SMILE introduces random temporal abstraction features, which we define as e-lets, as a means to capture information about class-discriminatory events that occur across the span of complete interval sequences. Our empirical evaluation covers a wide array of benchmark datasets and fourteen novel datasets for adverse drug event detection. We demonstrate that introducing simple sequential features, followed by progressively more complex features, each improves classification performance. Importantly, this investigation demonstrates that SMILE significantly improves AUC performance over the current state-of-the-art. The investigation also reveals that the choice of the underlying classification algorithm is important for achieving superior predictive performance, and shows how the number of features influences the performance of our framework.
Keywords Event interval sequences · Temporal intervals · Temporal abstractions · Classification
Responsible editor: Toon Calders.
¹ Stockholm University, Stockholm, Sweden
² Hasso Plattner Institute for Software Systems Engineering, Potsdam, Germany
1 Introduction
Sequences of temporal intervals are defined as ordered sets of events occurring over time, where each event has a duration and may co-occur with other events. As a result, several temporal relations are possible between pairs of events, such as one event overlapping another, or two events starting concurrently with one ending before the other. Such sequences, also known as e-sequences, can be found in a variety of application domains, including sign language transcription (Papapetrou et al. 2009), human activity recognition and monitoring (Uddin and Uddiny 2015), music classification (Pachet et al. 1996), and predicting clinical outcomes from medical records (Kosara and Miksch 2001; Moskovitch and Shahar 2015a).
An example of an e-sequence, taken from the healthcare domain, is depicted in Fig. 1. The example e-sequence contains six events describing an Adverse Drug Reaction (ADR) caused by the use of the medication "procainamide" on a patient suffering from arrhythmia. We observe that the patient shown in the example underwent an episode of arrhythmia (first event) before being hospitalized (second event) and administered procainamide (third event). A second episode of arrhythmia occurred shortly after (fourth