Research Article

Clustering and Symbolic Analysis of Cardiovascular Signals: Discovery and Visualization of Medically Relevant Patterns in Long-Term Data Using Limited Prior Knowledge

Zeeshan Syed,1 John Guttag,1 and Collin Stultz1, 2

1 Massachusetts Institute of Technology, Cambridge, MA 02139-4307, USA
2 Brigham and Women’s Hospital, Cambridge, MA 02115, USA

Received 30 April 2006; Revised 18 December 2006; Accepted 27 December 2006

Recommended by Maurice Cohen

This paper describes novel fully automated techniques for analyzing large amounts of cardiovascular data. In contrast to traditional medical expert systems, our techniques incorporate no a priori knowledge about disease states. This facilitates the discovery of unexpected events. We start by transforming continuous waveform signals into symbolic strings derived directly from the data. Morphological features are used to partition heart beats into clusters by maximizing the dynamic time-warped sequence-aligned separation of clusters. Each cluster is assigned a symbol, and the original signal is replaced by the corresponding sequence of symbols. The symbolization process allows us to shift from the analysis of raw signals to the analysis of sequences of symbols. This discrete representation reduces the amount of data by several orders of magnitude, making the search space for discovering interesting activity more manageable. We describe techniques that operate in this symbolic domain to discover rhythms, transient patterns, abnormal changes in entropy, and clinically significant relationships among multiple streams of physiological data. We tested our techniques on cardiologist-annotated ECG data from forty-eight patients. Our process for labeling heart beats produced results that were consistent with the cardiologist-supplied labels 98.6% of the time, and often provided relevant finer-grained distinctions. Our higher-level analysis techniques proved effective at identifying clinically relevant activity not only from symbolized ECG streams, but also from multimodal data obtained by symbolizing ECG and other physiological data streams. Using no prior knowledge, our analysis techniques uncovered examples of ventricular bigeminy and trigeminy, ectopic atrial rhythms with aberrant ventricular conduction, paroxysmal atrial tachyarrhythmias, atrial fibrillation, and pulsus paradoxus.

Copyright © 2007 Zeeshan Syed et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
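
The symbolization step summarized in the abstract (compare beats under dynamic time warping, group similar beats, and replace each beat by its cluster symbol) can be illustrated with a minimal sketch. This is not the authors' clustering procedure: the greedy threshold rule below is a simple stand-in assumed for illustration, and the names `dtw_distance`, `symbolize`, and `threshold`, as well as the toy beat values, are hypothetical.

```python
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping cost between two 1-D sequences,
    using absolute difference as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def symbolize(beats: List[Sequence[float]], threshold: float) -> str:
    """Assign each beat the symbol of the first cluster whose
    representative lies within `threshold` under DTW; otherwise open
    a new cluster. ASSUMPTION: this greedy rule is an illustrative
    stand-in, not the separation-maximizing clustering in the paper."""
    representatives: List[Sequence[float]] = []
    symbols: List[str] = []
    for beat in beats:
        for k, rep in enumerate(representatives):
            if dtw_distance(beat, rep) <= threshold:
                symbols.append(chr(ord("A") + k))
                break
        else:
            # No existing cluster is close enough: start a new one.
            representatives.append(beat)
            symbols.append(chr(ord("A") + len(representatives) - 1))
    return "".join(symbols)


# Toy "beats": two morphologies with slightly different durations.
beats = [[0, 1, 0], [0, 1, 1, 0], [0, 5, 0], [0, 1, 0], [0, 5, 5, 0]]
sequence = symbolize(beats, threshold=1.0)
```

With these toy inputs, `sequence` is `"AABAB"`: beats of different lengths but the same shape fall into one cluster, and the signal is reduced to a short string on which pattern discovery can operate.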

1. INTRODUCTION

The increasing prevalence of long-term monitoring in both ICU and ambulatory settings will yield ever-increasing amounts of physiological data. The sheer volume of information generated about an individual patient poses a serious challenge to healthcare professionals. Patients in an ICU setting, for example, often have continuous streams of data arising from telemetry monitors, pulse oximeters, Swan-Ganz catheter