Marginal Space Deep Learning: Efficient Architecture for Detection in Volumetric Image Data

1 Imaging and Computer Vision, Siemens Corporate Technology, Princeton NJ, USA
2 Pattern Recognition Lab, Friedrich-Alexander-Universität, Erlangen-Nürnberg

Abstract. Current state-of-the-art techniques for fast and robust parsing of volumetric medical image data exploit large annotated image databases and are typically based on machine learning methods. Two main challenges to be solved are the low efficiency in scanning large volumetric input images and the need for manual engineering of image features. This work proposes Marginal Space Deep Learning (MSDL) as an effective solution that combines the strengths of efficient object parametrization in hierarchical marginal spaces with the automated feature design of Deep Learning (DL) network architectures. Representation learning through DL automatically identifies, disentangles and learns explanatory factors directly from low-level image data. However, the direct application of DL to volumetric data results in very high complexity, due to the increased number of transformation parameters. For example, the number of parameters defining a similarity transformation increases to 9 in 3D (3 for location, 3 for orientation and 3 for scale). The mechanism of marginal space learning provides excellent run-time performance by learning classifiers in high-probability regions in spaces of gradually increasing dimensionality, for example starting from location only (3D), then location and orientation (6D), and finally the full parameter space (9D). In addition, for parametrized feature computation, we propose to simplify the network by replacing the standard, pre-determined feature sampling pattern with a sparse, adaptive, self-learned pattern. The MSDL framework is evaluated on detecting the aortic heart valve in 3D ultrasound data. The dataset contains 3795 volumes from 150 patients. Our method outperforms the state-of-the-art with an improvement of 36%, running in less than one second. To our knowledge, this is the first successful demonstration of the potential of DL for detection in full 3D data with parametrized representations.
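The staged search described in the abstract (location, then location and orientation, then the full 9D space) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the helper names (`top_k`, `marginal_space_search`), the tiny discrete grids, the candidate budget `k`, and above all the toy distance-based scoring functions, which merely stand in for the learned deep classifiers of the paper. It only shows how keeping a few high-probability candidates per stage reduces the number of hypotheses that must be scored.

```python
import itertools

def dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def top_k(candidates, score, k):
    """Keep the k highest-scoring candidates."""
    return sorted(candidates, key=score, reverse=True)[:k]

def marginal_space_search(positions, orientations, scales,
                          score_pos, score_pos_ori, score_full, k=5):
    """Search spaces of increasing dimensionality, pruning after each stage."""
    # Stage 1: location only (3D).
    best_pos = top_k(positions, score_pos, k)
    # Stage 2: location + orientation (6D), only around surviving locations.
    cand6 = [(p, o) for p in best_pos for o in orientations]
    best6 = top_k(cand6, lambda h: score_pos_ori(*h), k)
    # Stage 3: full similarity transform (9D), only around 6D survivors.
    cand9 = [(p, o, s) for (p, o) in best6 for s in scales]
    best = max(cand9, key=lambda h: score_full(*h))
    evaluated = len(positions) + len(cand6) + len(cand9)
    return best, evaluated

# Hypothetical ground truth and toy scorers (stand-ins for trained networks).
TRUE_POS, TRUE_ORI, TRUE_SCALE = (2, 3, 1), (0.0, 0.5, 0.0), (1.0, 1.5, 1.0)

def score_pos(p):
    return -dist2(p, TRUE_POS)

def score_pos_ori(p, o):
    return score_pos(p) - dist2(o, TRUE_ORI)

def score_full(p, o, s):
    return score_pos_ori(p, o) - dist2(s, TRUE_SCALE)

# Small illustrative grids: 64 locations, 8 orientations, 8 scales.
positions = list(itertools.product(range(4), repeat=3))
orientations = list(itertools.product((0.0, 0.5), repeat=3))
scales = list(itertools.product((1.0, 1.5), repeat=3))

best, evaluated = marginal_space_search(
    positions, orientations, scales, score_pos, score_pos_ori, score_full, k=5)
exhaustive = len(positions) * len(orientations) * len(scales)
print(best, evaluated, exhaustive)
```

On these toy grids the staged search scores 64 + 40 + 40 hypotheses, whereas an exhaustive scan of the joint 9D grid would score 64 × 8 × 8. The gap widens quickly at realistic grid resolutions, which is the efficiency argument the abstract makes for marginal spaces.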

1 Introduction

Effective data representation is essential for the performance of machine learning algorithms [1]. This motivates the large effort invested in handcrafting features that capture the underlying observations in a space where learning is tractable. For this purpose, complex data preprocessing and transformation pipelines are used to design representations that can ensure an effective learning process.

© Springer International Publishing Switzerland 2015. N. Navab et al. (Eds.): MICCAI 2015, Part I, LNCS 9349, pp. 710–718, 2015. DOI: 10.1007/978-3-319-24553-9_87

This type of approach is, however, subject to severe limitations, since it relies exclusively on human ingenuity to disentangle and understand prior information hidden in the data and then use such knowledge for feature engineering [2, 3]. Representation learning through Deep Learning (DL) addresses these limitations and aims to expand the scope and general applicabili