Learning the Clustering of Longitudinal Shape Data Sets into a Mixture of Independent or Branching Trajectories
Vianney Debavelaere¹ · Stanley Durrleman² · Stéphanie Allassonnière³ · for the Alzheimer’s Disease Neuroimaging Initiative
Received: 10 September 2019 / Accepted: 9 May 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract
Given repeated observations of several subjects over time, i.e. a longitudinal data set, this paper introduces a new model to learn a classification of shape progressions in an unsupervised setting: we automatically cluster a longitudinal data set into different classes without labels. For each cluster, our method learns an average shape trajectory (or representative curve) and its variance in space and time. Representative trajectories are built by combining pieces of curves. This mixture model is flexible enough to handle independent trajectories for each cluster as well as fork and merge scenarios. Estimating such nonlinear mixture models in high dimension is known to be difficult because of the trapping-state effect, which hampers the optimisation of cluster assignments during training. We address this issue by using a tempered version of the stochastic EM algorithm. Finally, we apply our algorithm to several data sets. First, synthetic data are used to show that a tempered scheme achieves better convergence. We then apply our method to real data sets: 1D RECIST scores used to monitor tumor growth, 3D facial expressions, and meshes of the hippocampus. In particular, we show how the method can be used to test different scenarios of hippocampus atrophy in ageing by using a heterogeneous population of normally ageing individuals and subjects with mild cognitive impairment.
Keywords Longitudinal data analysis · Mixture model · Branching population · Stochastic optimization · Statistical model · Riemannian manifold
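To make the idea of tempering concrete, here is a minimal sketch on a toy problem. It is not the model of the paper, which works with shape trajectories; it assumes a 1D Gaussian mixture with a known, shared standard deviation. The function names (`tempered_posteriors`, `tempered_sem`), the geometric temperature schedule, and all default values are hypothetical choices made for illustration. The point it shows is that raising the class posteriors to the power 1/T (with T ≥ 1) flattens them, so that sampled assignments can still change class in early iterations instead of being trapped.

```python
import numpy as np


def tempered_posteriors(x, means, sigma, weights, temperature):
    """Class posteriors of a 1D Gaussian mixture, flattened by a temperature >= 1."""
    # log(weight_k * N(x_i | mean_k, sigma^2)), up to a constant shared by all classes.
    log_p = np.log(weights)[None, :] - 0.5 * ((x[:, None] - means[None, :]) / sigma) ** 2
    # Tempering: raise the posterior to the power 1/T, then renormalise.
    log_p = log_p / temperature
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)


def tempered_sem(x, n_classes=2, n_iter=200, t0=10.0, decay=0.95, sigma=1.0, seed=0):
    """Toy stochastic EM with tempered class sampling (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)
    means = rng.choice(x, size=n_classes, replace=False).astype(float)
    weights = np.full(n_classes, 1.0 / n_classes)
    for it in range(n_iter):
        # Assumed schedule: the temperature decreases geometrically towards 1.
        temperature = 1.0 + (t0 - 1.0) * decay ** it
        post = tempered_posteriors(x, means, sigma, weights, temperature)
        # Stochastic step: sample one class per observation from the tempered posterior.
        u = rng.random(len(x))
        labels = (post.cumsum(axis=1) < u[:, None]).sum(axis=1)
        labels = np.minimum(labels, n_classes - 1)  # guard against rounding in cumsum
        # Maximisation step: update each class from its sampled members.
        for k in range(n_classes):
            members = x[labels == k]
            if members.size > 0:
                means[k] = members.mean()
            # Smoothed proportions keep every weight strictly positive.
            weights[k] = (members.size + 1.0) / (len(x) + n_classes)
    return means, weights


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = np.concatenate([rng.normal(-5.0, 1.0, 200), rng.normal(5.0, 1.0, 200)])
    est_means, est_weights = tempered_sem(data)
    print("estimated means:", np.sort(est_means))
    print("estimated weights:", est_weights)
```

With `temperature=1.0` the function returns the usual posteriors; larger values move them towards the uniform distribution, which is what lets observations leave a poorly initialised class.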
Communicated by B. C. Vemuri.
Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: https://adni.loni.usc.edu.
✉ Vianney Debavelaere [email protected]
¹ Centre de Mathématiques Appliquées, École polytechnique, Palaiseau, France
² ARAMIS Lab, Institut du Cerveau et de la Moelle épinière, 47 Boulevard de l’Hôpital, Paris, France
³ Centre de Recherche des Cordeliers, Université Paris Descartes, Paris, France
1 Introduction
The emergence of large longitudinal data sets (subjects observed repeatedly at different time points) has allowed the construction of different models that improve the understanding of biological or natural phenomena. Longitudinal studies have numerous applications: understanding of the differences of progression in neurodegenerative diseases such as Alzheimer’s, chemotherapy monitorin