A Sparse Bayesian Learning Algorithm for Longitudinal Image Data




Abstract. Longitudinal imaging studies, where serial (multiple) scans are collected on each individual, are becoming increasingly widespread. The field of machine learning has in general neglected the longitudinal design, since many algorithms are built on the assumption that each datapoint is an independent sample. Thus, the application of general purpose machine learning tools to longitudinal image data can be sub-optimal. Here, we present a novel machine learning algorithm designed to handle longitudinal image datasets. Our approach builds on a sparse Bayesian image-based prediction algorithm. Our empirical results demonstrate that the proposed method can offer a significant boost in prediction performance with longitudinal clinical data.

Keywords: Machine learning, Image-based prediction, Longitudinal data.

1 Introduction

Machine learning algorithms are increasingly applied to biomedical image data for a range of clinical applications, including computer-aided detection/diagnosis (CAD) and studying group differences, e.g. [1,2,3]. In early biomedical applications, off-the-shelf algorithms such as Support Vector Machines were employed on image intensity data. However, there has been a recent proliferation of customized methods that derive optimal image features and incorporate domain knowledge about the clinical context and imaging data, e.g. [4,5,6,7,8]. Such customized methods can offer a significant increase in prediction accuracy.

Machine learning in general, and its application to population-level biomedical image analysis in particular, has largely been concerned with the cross-sectional design, where each sample is treated as independent. Yet, as data acquisition costs continue to fall and data collection efforts become more collaborative and standardized, longitudinal designs have become increasingly widespread. Longitudinal studies, where serial data are collected on each individual, can offer increased sensitivity and specificity in detecting associations, and provide insights into the temporal dynamics of underlying biological processes.

Real-life longitudinal data suffer from several technical issues, which make their analysis challenging. Subject drop-outs, missing visits, variable number of

Supported by NIH NIBIB 1K25EB013649-01 and a BrightFocus grant (AHAFA2012333). Data used were obtained from ADNI: http://tinyurl.com/ADNI-main.

© Springer International Publishing Switzerland 2015. N. Navab et al. (Eds.): MICCAI 2015, Part III, LNCS 9351, pp. 411–418, 2015. DOI: 10.1007/978-3-319-24574-4_49

M.R. Sabuncu

Table 1. Data from annual ADNI MRI visits analyzed in this study. Note that some subjects also had MRI visits at 6, 18, 30, 42, 54, and 66 months.

Planned visit time (months)   Mean ± Std. time (months)   Number of imaging sessions
Baseline                      0                           791
12                            13.1 ± 0.8                  649
24                            25.5 ± 1.2                  518
36                            37.7 ± 1.2                  336
48                            50.7 ± 2.2                  216
60                            62.4 ± 1.7                  159
72                            74.2 ± 2.0                  131
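The attrition in Table 1 can be made concrete by expressing each visit's session count as a fraction of the baseline count. The sketch below is only an illustration and is not part of the proposed method: the counts are copied from Table 1, while the names `sessions_per_visit` and `retention` are hypothetical. The table alone cannot distinguish drop-outs from missed visits or from follow-ups that had not yet taken place, so the fractions describe available data, not a drop-out rate.

```python
# Number of imaging sessions per planned annual visit, copied from Table 1.
# Keys are planned visit times in months (0 = baseline).
sessions_per_visit = {0: 791, 12: 649, 24: 518, 36: 336, 48: 216, 60: 159, 72: 131}


def retention(counts):
    """Return each visit's session count as a fraction of the baseline count.

    `counts` maps planned visit month -> number of sessions; the earliest
    month is treated as baseline. (Hypothetical helper, for illustration.)
    """
    baseline = counts[min(counts)]
    return {month: n / baseline for month, n in sorted(counts.items())}


retained = retention(sessions_per_visit)
total_sessions = sum(sessions_per_visit.values())

for month, fraction in retained.items():
    n = sessions_per_visit[month]
    print(f"month {month:2d}: {n:3d} sessions ({fraction:.1%} of baseline)")
print(f"total annual sessions: {total_sessions}")
```

The shrinking fractions show why a method for such data has to accommodate a variable number of scans per subject.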

visits, and heterogeneity in the timing of visits are commonplace. For example, Table 1 illustrates these challenges with longitudinal data from the Al