Why Does Synthesized Data Improve Multi-sequence Classification?


1 Biomedical Imaging Group Rotterdam, Erasmus MC University Medical Center, The Netherlands
2 Department of Computer Science, University of Copenhagen, Denmark

Abstract. The classification and registration of incomplete multi-modal medical images, such as multi-sequence MRI with missing sequences, can sometimes be improved by replacing the missing modalities with synthetic data. This may seem counter-intuitive: synthetic data is derived from data that is already available, so it does not add new information. Why can it still improve performance? In this paper we discuss possible explanations. If the synthesis model is more flexible than the classifier, the synthesis model can provide features that the classifier could not have extracted from the original data. In addition, using synthetic information to complete incomplete samples increases the size of the training set. We present experiments with two classifiers, linear support vector machines (SVMs) and random forests, together with two synthesis methods that can replace missing data in an image classification problem: neural networks and restricted Boltzmann machines (RBMs). We used data from the BRATS 2013 brain tumor segmentation challenge, which includes multi-modal MRI scans with T1, T1 post-contrast, T2 and FLAIR sequences. The linear SVMs appear to benefit from the complex transformations offered by the synthesis models, whereas the random forests mostly benefit from having more training data. Training on the hidden representation from the RBM brought the accuracy of the linear SVMs close to that of random forests.
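The first explanation, that a flexible synthesis model can hand a linear classifier features it could not extract itself, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the setup used in the paper: the data is simulated rather than taken from BRATS, a k-nearest-neighbour regressor stands in for the neural network and RBM synthesis models, logistic regression stands in for the linear SVM, and all names (make_voxels, synthesize_b, train_linear) are hypothetical.

```python
import math
import random

def make_voxels(n, rng):
    """Toy data: seq_a is always observed; seq_b is a nonlinear function of seq_a."""
    voxels = []
    for _ in range(n):
        a = rng.uniform(-1.0, 1.0)
        b = a * a + rng.gauss(0.0, 0.02)
        label = 1 if abs(a) > 0.5 else 0  # not linearly separable in seq_a alone
        voxels.append((a, b, label))
    return voxels

def synthesize_b(a, complete, k=5):
    """Assumed synthesis model: k-NN regression of seq_b on seq_a, fitted on complete samples."""
    nearest = sorted(complete, key=lambda v: abs(v[0] - a))[:k]
    return sum(v[1] for v in nearest) / len(nearest)

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_linear(features, labels, epochs=500, lr=0.5):
    """Logistic regression by batch gradient descent (a linear stand-in for the linear SVM)."""
    dim, n = len(features[0]), len(features)
    w, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + bias) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        bias -= lr * grad_b / n
    return w, bias

def accuracy(model, features, labels):
    w, bias = model
    hits = 0
    for x, y in zip(features, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + bias
        hits += int((1 if score > 0 else 0) == y)
    return hits / len(labels)

def build_features(voxels, complete, with_synthesis):
    # The true seq_b of these voxels is never used: it is treated as missing.
    if with_synthesis:
        return [[a, synthesize_b(a, complete)] for a, _, _ in voxels]
    return [[a] for a, _, _ in voxels]

rng = random.Random(0)
complete = make_voxels(200, rng)  # both sequences available; used to fit the synthesis model
train = make_voxels(200, rng)     # seq_b treated as missing
test = make_voxels(200, rng)      # seq_b treated as missing
y_train = [v[2] for v in train]
y_test = [v[2] for v in test]

model_a = train_linear(build_features(train, complete, False), y_train)
acc_without = accuracy(model_a, build_features(test, complete, False), y_test)

model_ab = train_linear(build_features(train, complete, True), y_train)
acc_with = accuracy(model_ab, build_features(test, complete, True), y_test)

print("linear classifier, seq_a only:", acc_without)
print("linear classifier, seq_a + synthetic seq_b:", acc_with)
```

The synthetic seq_b contains no information that is not already in seq_a, yet it presents that information in a form a linear decision boundary can use. A more flexible classifier, such as a random forest, could recover the same boundary from seq_a directly, so this sketch says nothing about the second explanation (a larger training set).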

1 Introduction

Multi-sequence data can be very informative in medical imaging, but using it may cause some practical problems. Training a classifier on multi-modal data, for instance, generally requires that all modalities are available for all samples. If some modalities are missing, there is a range of methods for handling or imputing the missing values in standard statistical analysis [1]. Specifically for image analysis, there are synthesis methods that predict missing modalities. Some methods model the physical properties of the imaging process, e.g., to derive intrinsic tissue parameters from MRI scans [2] or to derive pseudo-CT from MRI in radiotherapy applications [3,4]. But an explicit model of the imaging process is not even required, as image processing techniques can be sufficient: for example, pseudo-CT images have also been made with tissue segmentation [5,6], with Gaussian mixture models [7] or by registering and combining CT images [8,9].

© Springer International Publishing Switzerland 2015. N. Navab et al. (Eds.): MICCAI 2015, Part I, LNCS 9349, pp. 531–538, 2015. DOI: 10.1007/978-3-319-24553-9_65

G. van Tulder and M. de Bruijne

Interestingly, data synthesis not only generates images but can also help as an intermediate step. For example, Iglesias et al. [10] found that synthetic data improved the registration of multi-sequence brain MRI. Roy et al. [11] showed that synthetic sequences can improve segmentation consistency in datasets
