Uncertainty-Driven Forest Predictors for Vertebra Localization and Segmentation

1 Max Planck Institute of Molecular Cell Biology and Genetics, Germany
2 Biomedical Image Analysis Group, Imperial College London, UK
3 Computer Vision Lab Dresden, Technical University Dresden, Germany
{richmond,kainmueller}@mpi-cbg.de

Abstract. Accurate localization, identification and segmentation of vertebrae is an important task in medical and biological image analysis. The prevailing approach to solve such a task is to first generate pixel-independent features for each vertebra, e.g. via a random forest predictor, which are then fed into an MRF-based objective to infer the optimal MAP solution of a constellation model. We abandon this static, two-stage approach and mix feature generation with model-based inference in a new, more flexible way. We evaluate our method on two data sets with different objectives. The first is semantic segmentation of a 21-part body plan of zebrafish embryos in microscopy images, and the second is localization and identification of vertebrae in benchmark human CT.

1 Introduction

State-of-the-art approaches for object localization or semantic segmentation typically employ pixel-wise forest predictors combined with MAP inference on a graphical constellation model [1,2,3] or a (super-)pixel graph [4,5], respectively. A recent trend in computer vision replaces single-level forest predictors by deep, cascaded models for feature generation, such as CNNs [6] and Auto-Context Models [7]. These models learn a complex non-linear mapping from images to features that are relevant for the task at hand. This modeling framework is, however, static, as it separates feature generation from inference (i.e., “model fitting”). It has been shown that better features can be generated by interleaving feature generation with MAP inference [8,9,10].¹ In this work we take this idea a step further: instead of interleaving feature generation with a pixel-level structured model or model-agnostic smoothing, we interleave it with probabilistic inference in an object-level constellation model (Fig. 1).
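To make the cascaded baseline concrete, the following is a minimal sketch (not the authors' code) of an Auto-Context-style stack of forests in Python with scikit-learn: each stage's class posteriors are appended to the input features of the next stage, with no model-based inference between stages. The matrix `X` (n_pixels × n_features) and per-pixel labels `y` are assumed given, and the flat feature concatenation is a simplification of full Auto-Context, which also samples posteriors at spatial offsets.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_cascade(X, y, n_stages=2, n_trees=50):
    """Train a stack of forests: each stage sees the original features
    plus the class posteriors predicted by the previous stage."""
    stages, feats = [], X
    for _ in range(n_stages):
        rf = RandomForestClassifier(n_estimators=n_trees, random_state=0)
        rf.fit(feats, y)
        stages.append(rf)
        # Feed the per-pixel class posteriors back as extra "context" features.
        feats = np.hstack([X, rf.predict_proba(feats)])
    return stages

def predict_cascade(stages, X):
    """Run the cascade on new data; returns final-stage class posteriors."""
    feats = X
    for rf in stages:
        probs = rf.predict_proba(feats)
        feats = np.hstack([X, probs])
    return probs
```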

* Shared first authors. ** Shared last authors.
¹ Note that this is conceptually different from the classical “hierarchical” approach that, purely for the sake of pruning the search space to reduce run-time, performs feature generation and inference/model fitting multiple times on different scales.

© Springer International Publishing Switzerland 2015. N. Navab et al. (Eds.): MICCAI 2015, Part I, LNCS 9349, pp. 653–660, 2015. DOI: 10.1007/978-3-319-24553-9_80


Fig. 1. Our proposed pipeline for multi-class semantic segmentation. A stack of feature images is created by a standard filter bank and used to train a random forest classifier. The random forest output is then used, in combination with the original image, to generate candidate segmentations for each class by fitting multiple instances of appearance models. These candidate segmentations are weighted by means of probabilistic inference in a constellation model that captures the relative locations of classes. The weighted and fused candidate segmentations are then fed back as additional “smoothed” features into a new random forest.
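As a reading aid for the feedback step in Fig. 1, here is a minimal sketch of the candidate-fusion stage, under stated assumptions: `candidates` holds per-class candidate masks from the appearance-model fits, and `weights` holds per-candidate weights, e.g. marginals from inference in the constellation model; both names are hypothetical, and the weighted average shown is one plausible fusion rule, not necessarily the paper's exact one.

```python
import numpy as np

def fuse_candidates(candidates, weights):
    """Fuse each class's candidate masks into one 'smoothed' probability map.

    candidates: dict mapping class id -> list of 2D binary masks
    weights:    dict mapping class id -> list of scalar candidate weights
                (assumed here to come from the constellation model)
    """
    fused = {}
    for cls, masks in candidates.items():
        w = np.asarray(weights[cls], dtype=float)
        w = w / w.sum()                        # normalize over candidates
        stack = np.stack(masks).astype(float)  # (n_candidates, H, W)
        # Weighted average over candidates -> one map per class, which can
        # then be appended to the feature stack of the next random forest.
        fused[cls] = np.tensordot(w, stack, axes=1)
    return fused

# Toy usage: two candidate masks for class 0, weighted 0.8 / 0.2.
smoothed = fuse_candidates({0: [np.eye(4), np.ones((4, 4))]},
                           {0: [0.8, 0.2]})
```

The fused maps act as the additional “smoothed” feature channels that the caption describes, closing the loop between inference and the next round of forest training.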