Higher Order Conditional Random Fields in Deep Neural Networks
Abstract. We address the problem of semantic segmentation using deep learning. Most segmentation systems include a Conditional Random Field (CRF) to produce a structured output that is consistent with the image’s visual features. Recent deep learning approaches have incorporated CRFs into Convolutional Neural Networks (CNNs), with some even training the CRF end-to-end with the rest of the network. However, these approaches have not employed higher order potentials, which have previously been shown to significantly improve segmentation performance. In this paper, we demonstrate that two types of higher order potential, based on object detections and superpixels, can be included in a CRF embedded within a deep network. We design these higher order potentials to allow inference with the differentiable mean field algorithm. As a result, all the parameters of our richer CRF model can be learned end-to-end with our pixelwise CNN classifier. We achieve state-of-the-art segmentation performance on the PASCAL VOC benchmark with these trainable higher order potentials.
Keywords: Semantic segmentation · Conditional random fields · Deep learning · Convolutional Neural Networks
1 Introduction
Semantic segmentation involves assigning a visual object class label to every pixel in an image, resulting in a segmentation with a semantic meaning for each segment. While a strong pixel-level classifier is critical for obtaining high accuracy in this task, it is also important to enforce the consistency of the semantic segmentation output with visual features of the image. For example, segmentation boundaries should usually coincide with strong edges in the image, and regions in the image with similar appearance should have the same label.
Recent advances in deep learning have enabled researchers to create stronger classifiers, with automatically learned features, within a Convolutional Neural Network (CNN) [1–3]. This has resulted in large improvements in semantic segmentation accuracy on widely used benchmarks such as PASCAL VOC [4]. CNN classifiers are now considered the standard choice for pixel-level classifiers used in semantic segmentation.
Electronic supplementary material: The online version of this chapter (doi:10.1007/978-3-319-46475-6_33) contains supplementary material, which is available to authorized users. © Springer International Publishing AG 2016. B. Leibe et al. (Eds.): ECCV 2016, Part II, LNCS 9906, pp. 524–540, 2016. DOI: 10.1007/978-3-319-46475-6_33
On the other hand, probabilistic graphical models have long been popular for structured prediction of labels, with constraints enforcing label consistency. Conditional Random Fields (CRFs) have been the most common framework, and various rich and expressive models [5–7], based on higher order clique potentials, have been developed to improve segmentation performance. Whilst some deep learning methods showed impressive performance in semantic segmentation without incorporating graphical models [3,8], curr