Semi-Supervised Consensus Clustering for ECG Pathology Classification



1 Instituto de Telecomunicações, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal [email protected]
2 Instituto Superior de Engenharia de Lisboa, Lisbon, Portugal
3 FBK-irst, Trento, Italy
4 CardioID Technologies, Lisbon, Portugal

Abstract. Pervasive technology is changing the paradigm of healthcare by empowering users and families with the means for self-care and general health management. However, this requires accurate algorithms for information processing and pathology detection. Accordingly, this paper presents a system for electrocardiography (ECG) pathology classification, relying on a novel semi-supervised consensus clustering algorithm, which finds a consensus partition among a set of baseline clusterings that have been collected for the data under consideration. In contrast to typical unsupervised scenarios, our solution allows exploiting partial prior knowledge of a subset of data points. Our method is built upon the evidence accumulation framework to effectively sidestep the cluster correspondence problem. Computationally, the consensus partition is sought by exploiting a result known as the Baum-Eagon inequality in the probability domain, which allows for a step-size-free optimization. Experiments on standard benchmark datasets show the validity of our method relative to the state of the art. On the real-world problem of ECG pathology classification, the proposed method achieves performance comparable to supervised learning methods using as few as 20% labeled data points.

Keywords: Electrocardiography · ECG · Semi-supervised learning · Consensus clustering · Evidence accumulation clustering

1 Introduction

Heart disease, or more formally cardiovascular disease (CVD), is the leading cause of death worldwide. An estimated 17.3 million people died from CVD in 2008, representing 30% of all global deaths. Of these deaths, an estimated 7.3 million were due to coronary heart disease and 6.2 million were due to stroke. In the US, about 0.6 million people die from heart disease every year (25% of deaths).

© Springer International Publishing Switzerland 2015. A. Bifet et al. (Eds.): ECML PKDD 2015, Part III, LNAI 9286, pp. 150–164, 2015. DOI: 10.1007/978-3-319-23461-8_10


These statistics motivate our work, which provides a semi-supervised electrocardiography (ECG) pathology classification system intended to help mitigate these threats. The system builds upon the pervasive healthcare framework, in which devices are becoming handier, more user-friendly, and more comfortable, with a focus on usability and on continuous (or quasi-continuous) monitoring of biosignals. The aim is to automatically classify ECG data streams acquired by monitoring devices and to raise alerts for abnormal situations. The use of the semi-supervised learning paradigm is motivated by the existence of prior knowledge about the classes in this domain, namely pathologies, which can be gathered from annotated records of some patients, but a large