Real-Time Semantic Segmentation with Label Propagation



Abstract. Despite the success of convolutional neural networks for semantic image segmentation, CNNs cannot be used for many applications due to limited computational resources. Even efficient approaches based on random forests are not efficient enough for real-time performance in some cases. In this work, we propose an approach based on superpixels and label propagation that reduces the runtime of a random forest approach by a factor of 192 while increasing the segmentation accuracy.

1 Introduction

Although convolutional neural networks have shown great success in semantic image segmentation in recent years [1–3], fast inference can only be achieved through the massive parallelism offered by modern GPUs. For many applications, such as mobile platforms or unmanned aerial vehicles, however, power consumption matters and GPUs are often not available. A client-server solution is not always an option due to latency and limited bandwidth. There is therefore a need for very efficient approaches that segment images in real time on single-threaded architectures.

In this work, we analyze in depth how design choices affect the accuracy and runtime of random forests and propose an efficient superpixel-based approach with label propagation for videos. As illustrated in Fig. 1, we use a very efficient quadtree representation for superpixels. The superpixels are then classified by random forests. For classification, we investigate two methods: the first uses the empirical class distribution, and the second models the spatial distributions of class labels by Gaussians. For video data, we propose label propagation to reduce the runtime without substantially decreasing the segmentation accuracy. An additional spatial smoothing step even improves the accuracy.

We evaluate our approach on the CamVid dataset [4]. Compared to a standard random forest, we reduce the runtime by a factor of 192 while increasing the global pixel accuracy by 4 percentage points. A comparison with state-of-the-art approaches shows that our approach is competitive in accuracy while achieving real-time performance on a single-threaded architecture.
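As an illustration of classification by the empirical class distribution, the following minimal sketch averages the class distributions stored at the leaves that a superpixel reaches in each tree and picks the most probable class. This is the usual random forest prediction rule, not necessarily the exact procedure of this work; the function names and the example class labels are hypothetical.

```python
from typing import Dict, List

Distribution = Dict[str, float]  # class label -> empirical probability


def average_distributions(leaf_distributions: List[Distribution]) -> Distribution:
    """Average the per-tree leaf distributions (assumed rule: uniform tree weights)."""
    classes = {label for dist in leaf_distributions for label in dist}
    n_trees = len(leaf_distributions)
    return {
        label: sum(dist.get(label, 0.0) for dist in leaf_distributions) / n_trees
        for label in classes
    }


def classify_superpixel(leaf_distributions: List[Distribution]) -> str:
    """Return the class with the highest averaged probability."""
    averaged = average_distributions(leaf_distributions)
    return max(averaged, key=averaged.get)


# Hypothetical leaves reached by one superpixel in a forest of two trees.
leaves = [
    {"road": 0.5, "sky": 0.5},
    {"road": 0.75, "car": 0.25},
]
label = classify_superpixel(leaves)
```

Averaging keeps the output a valid probability distribution, which would also allow later steps such as smoothing to operate on soft scores rather than hard labels.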

© Springer International Publishing Switzerland 2016. G. Hua and H. Jégou (Eds.): ECCV 2016 Workshops, Part II, LNCS 9914, pp. 3–14, 2016. DOI: 10.1007/978-3-319-48881-3_1


R. Sheikh et al.

Fig. 1. For efficient segmentation, we use a quadtree to create superpixels and classify the superpixels by a random forest.
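A quadtree decomposition of the kind shown in Fig. 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the split criterion (intensity range above a threshold), the parameters max_range and min_size, and the restriction to square grayscale images with power-of-two side length are all choices made for the example.

```python
from typing import List, Tuple

Image = List[List[int]]       # grayscale image as rows of intensities
Block = Tuple[int, int, int]  # (x, y, size) of one square superpixel


def quadtree_superpixels(img: Image, max_range: int = 16, min_size: int = 2) -> List[Block]:
    """Split a square, power-of-two image into homogeneous square blocks.

    Assumed split rule: a block is divided into four quadrants while its
    intensity range exceeds max_range and it is larger than min_size.
    """
    blocks: List[Block] = []

    def intensity_range(x: int, y: int, size: int) -> int:
        values = [img[r][c] for r in range(y, y + size) for c in range(x, x + size)]
        return max(values) - min(values)

    def split(x: int, y: int, size: int) -> None:
        if size <= min_size or intensity_range(x, y, size) <= max_range:
            blocks.append((x, y, size))  # homogeneous or minimal: keep as a leaf
            return
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                split(x + dx, y + dy, half)

    split(0, 0, len(img))
    return blocks


# 8x8 example: a bright 4x4 patch in the top-left corner of a dark image.
example = [[200 if r < 4 and c < 4 else 0 for c in range(8)] for r in range(8)]
superpixels = quadtree_superpixels(example)
```

Each leaf is described by three integers, so large homogeneous regions are covered by few superpixels and fewer classifier evaluations are needed than with per-pixel classification.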

2 Related Work

A popular approach to semantic segmentation uses a variety of features such as appearance, depth, or edges and classifies each pixel with a classifier such as a random forest or boosting [4,5]. Since pixel-wise classification can be very noisy, conditional random fields have been used to model the spatial relations of pixels and obtain a smooth segmentation [6,7]. Conditional random fields, however, are too expensive for many applications. In [8], a structured random forest has been proposed that predicts not a single label per pixel but the labels of the entire neighborhood. Me