Comparison of Feature Selection Techniques in Machine Learning for Anatomical Brain MRI in Dementia


ORIGINAL ARTICLE

Jussi Tohka (1,2) · Elaheh Moradi (3) · Heikki Huttunen (3) · Alzheimer’s Disease Neuroimaging Initiative

© Springer Science+Business Media New York 2016

Abstract We present a comparative split-half resampling analysis of various data-driven feature selection and classification methods for whole-brain voxel-based classification analysis of anatomical magnetic resonance images. We compared support vector machines (SVMs), with or without filter-based feature selection, several embedded feature selection methods, and stability selection. While comparisons of the accuracy of various classification methods have been reported previously, the variability of the out-of-training-sample classification accuracy and the set of selected features due to independent training and test sets have not been previously addressed in a brain imaging context. We studied two classification problems: 1) Alzheimer’s disease (AD) vs. normal control (NC) and 2) mild cognitive impairment (MCI) vs. NC classification. In AD vs. NC classification, the variability in the test accuracy due to the subject sample did not vary between different methods and exceeded the variability due to different classifiers. In MCI vs. NC classification, particularly with a large training set, embedded feature selection methods outperformed SVM-based ones, with the difference in the test accuracy exceeding the test accuracy variability due to the subject sample. The filter and embedded methods produced divergent feature patterns for MCI vs. NC classification, which, together with their good generalization performance, suggests the utility of embedded feature selection for this problem. The stability of the feature sets was strongly correlated with the number of features selected, weakly correlated with the stability of classification accuracy, and uncorrelated with the average classification accuracy.

Keywords Magnetic Resonance Imaging · Machine Learning · Feature selection · Alzheimer’s Disease · Classification · Multivariate pattern analysis

Alzheimer’s Disease Neuroimaging Initiative (ADNI) is a Group/Institutional Author.

Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf

Electronic supplementary material The online version of this article (doi:10.1007/s12021-015-9292-3) contains supplementary material, which is available to authorized users.
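The split-half resampling idea summarized in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' pipeline. It rests on these assumptions: the data are synthetic Gaussian features rather than MRI voxels, a t-score ranking stands in for filter-based selection, a nearest-centroid rule stands in for the SVM, the halves are stratified by class, and the Dice overlap is used as one possible feature-set stability measure. All function names are hypothetical.

```python
import random
import statistics


def make_data(n_subjects, n_features, n_informative, effect, rng):
    """Synthetic two-class data; only the first n_informative features differ."""
    X, y = [], []
    for i in range(n_subjects):
        label = i % 2  # balanced classes
        X.append([rng.gauss(effect * label if j < n_informative else 0.0, 1.0)
                  for j in range(n_features)])
        y.append(label)
    return X, y


def select_top_k(X, y, k):
    """Filter selection: keep the k features with the largest |t|-like score."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        se = (statistics.variance(a) / len(a)
              + statistics.variance(b) / len(b)) ** 0.5
        scores.append(abs(statistics.mean(a) - statistics.mean(b)) / se)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def fit_centroids(X, y, features):
    """Nearest-centroid classifier restricted to the selected features."""
    feats = sorted(features)
    cents = {}
    for lab in (0, 1):
        rows = [row for row, l in zip(X, y) if l == lab]
        cents[lab] = [statistics.mean(r[j] for r in rows) for j in feats]
    return feats, cents


def accuracy(model, X, y):
    """Fraction of subjects assigned to the class with the closest centroid."""
    feats, cents = model
    correct = 0
    for row, lab in zip(X, y):
        dist = {c: sum((row[j] - m) ** 2 for j, m in zip(feats, cents[c]))
                for c in cents}
        correct += min(dist, key=dist.get) == lab
    return correct / len(y)


def dice(a, b):
    """Dice overlap of two feature sets (1.0 means identical sets)."""
    return 2 * len(a & b) / (len(a) + len(b))


def stratified_halves(y, rng):
    """Split subject indices into two disjoint halves, class by class."""
    h1, h2 = [], []
    for lab in sorted(set(y)):
        idx = [i for i, l in enumerate(y) if l == lab]
        rng.shuffle(idx)
        mid = len(idx) // 2
        h1 += idx[:mid]
        h2 += idx[mid:]
    return h1, h2


def split_half_resampling(X, y, k, n_repeats, rng):
    """Train on one half, test on the other, and swap; repeat n_repeats times."""
    accs, overlaps = [], []
    for _ in range(n_repeats):
        h1, h2 = stratified_halves(y, rng)
        X1, y1 = [X[i] for i in h1], [y[i] for i in h1]
        X2, y2 = [X[i] for i in h2], [y[i] for i in h2]
        f1, f2 = select_top_k(X1, y1, k), select_top_k(X2, y2, k)
        accs.append(accuracy(fit_centroids(X1, y1, f1), X2, y2))
        accs.append(accuracy(fit_centroids(X2, y2, f2), X1, y1))
        overlaps.append(dice(f1, f2))  # agreement of independently selected sets
    return accs, overlaps


rng = random.Random(0)
X, y = make_data(n_subjects=80, n_features=200, n_informative=10,
                 effect=1.5, rng=rng)
accs, overlaps = split_half_resampling(X, y, k=10, n_repeats=20, rng=rng)
print(f"mean test accuracy: {statistics.mean(accs):.3f}")
print(f"SD of test accuracy: {statistics.stdev(accs):.3f}")
print(f"mean Dice overlap of selected features: {statistics.mean(overlaps):.3f}")
```

Because the two halves never share subjects, the spread of `accs` reflects variability due to the subject sample, and `overlaps` shows how much the selected feature set changes between independent training sets. These are the two quantities the abstract contrasts with the average accuracy.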

 Jussi Tohka

[email protected]

1 Department of Bioengineering and Aerospace Engineering, Universidad Carlos III de Madrid, Avd. de la Universidad, 30, 28911, Leganes, Spain

2 Instituto de Investigación