A Sparse Bayesian Learning Algorithm for Longitudinal Image Data
Abstract. Longitudinal imaging studies, where serial (multiple) scans are collected on each individual, are becoming increasingly widespread. The field of machine learning has in general neglected the longitudinal design, since many algorithms are built on the assumption that each datapoint is an independent sample. Thus, the application of general purpose machine learning tools to longitudinal image data can be sub-optimal. Here, we present a novel machine learning algorithm designed to handle longitudinal image datasets. Our approach builds on a sparse Bayesian image-based prediction algorithm. Our empirical results demonstrate that the proposed method can offer a significant boost in prediction performance with longitudinal clinical data.

Keywords: Machine learning, Image-based prediction, Longitudinal data.
1 Introduction
Machine learning algorithms are increasingly applied to biomedical image data for a range of clinical applications, including computer aided detection/diagnosis (CAD) and studying group differences, e.g. [1,2,3]. In early biomedical applications, off-the-shelf algorithms such as Support Vector Machines were employed on image intensity data. However, there has been a recent proliferation of customized methods that derive optimal image features and incorporate domain knowledge about the clinical context and imaging data, e.g. [4,5,6,7,8]. Such customized methods can offer a significant increase in prediction accuracy.

Machine learning in general, and its application to population-level biomedical image analysis in particular, has largely been concerned with the cross-sectional design, where each sample is treated as independent. Yet, as data acquisition costs continue to fall and data collection efforts become more collaborative and standardized, longitudinal designs have become increasingly widespread. Longitudinal studies, where serial data are collected on each individual, can offer increased sensitivity and specificity in detecting associations, and provide insights into the temporal dynamics of underlying biological processes.

Real-life longitudinal data suffer from several technical issues, which make their analysis challenging. Subject drop-outs, missing visits, variable number of
Supported by NIH NIBIB 1K25EB013649-01 and a BrightFocus grant (AHAFA2012333). Data used were obtained from ADNI: http://tinyurl.com/ADNI-main.
© Springer International Publishing Switzerland 2015. N. Navab et al. (Eds.): MICCAI 2015, Part III, LNCS 9351, pp. 411–418, 2015. DOI: 10.1007/978-3-319-24574-4_49
M.R. Sabuncu
Table 1. Data from annual ADNI MRI visits analyzed in this study. Note some subjects had MRI visits at 6, 18, 30, 42, 54, and 66 months too.

Planned visit time (months)    Baseline   12           24           36           48           60           72
Mean ± Std. time (months)      0          13.1 ± 0.8   25.5 ± 1.2   37.7 ± 1.2   50.7 ± 2.2   62.4 ± 1.7   74.2 ± 2.0
Number of imaging sessions     791        649          518          336          216          159          131
visits, and heterogeneity in the timing of visits are commonplace. For example, Table 1 illustrates these challenges with longitudinal data from the Al
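
The attrition visible in Table 1 can be made concrete with a few lines of arithmetic. The sketch below is only an illustration, not part of the proposed method: it copies the session counts from Table 1 and expresses each planned visit as a fraction of the baseline count. The names `sessions` and `retention` are hypothetical, and treating the per-visit session count as a proxy for subject retention is an assumption, since the table reports imaging sessions rather than a nested set of subjects.

```python
# Session counts copied from Table 1:
# planned visit time (months) -> number of imaging sessions.
sessions = {0: 791, 12: 649, 24: 518, 36: 336, 48: 216, 60: 159, 72: 131}


def retention(counts):
    """Return each visit's session count as a fraction of the baseline count.

    Assumption: the count at month 0 is the reference; this is a rough
    proxy for drop-out, not a per-subject follow-up analysis.
    """
    baseline = counts[0]
    return {month: n / baseline for month, n in counts.items()}


if __name__ == "__main__":
    for month, frac in retention(sessions).items():
        print(f"month {month:2d}: {sessions[month]:3d} sessions, "
              f"{frac:.1%} of baseline")
```

Under this reading, fewer than a fifth of the baseline sessions have a counterpart at the 72-month visit, which is the kind of imbalance a longitudinal learning algorithm has to tolerate.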