Indoor scene understanding via RGB-D image segmentation employing depth-based CNN and CRFs


Wei Li 1,2,3 & Junhua Gu 4,5 & Yongfeng Dong 4,5 & Yao Dong 4,5 & Jungong Han 6

Received: 1 February 2019 / Revised: 5 April 2019 / Accepted: 10 June 2019
© Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract
With the availability of low-cost depth-visual sensing devices such as Microsoft Kinect, interest in indoor environment understanding is growing, and at its core is semantic segmentation of RGB-D images. Recent research shows that the convolutional neural network (CNN) still dominates the field of image semantic segmentation. However, the downsampling performed during the training of CNNs leads to unclear segmentation boundaries and poor classification accuracy. To address this problem, we propose a novel end-to-end deep architecture, termed FuseCRFNet, which seamlessly incorporates a fully-connected Conditional Random Field (CRF) model into a depth-based CNN framework. The proposed method exploits pixel-to-pixel relationships to increase the accuracy of image semantic segmentation. More importantly, we formulate the CRF as one of the layers in FuseCRFNet: in forward propagation it refines the coarse segmentation, and at the same time it passes errors back to facilitate training. We evaluate FuseCRFNet on the SUN RGB-D dataset, and the results show that the proposed algorithm outperforms existing semantic segmentation algorithms, improving accuracy by at least 2%, which further verifies its effectiveness.

Keywords Semantic segmentation · CNNs · RGB-D · Fully-connected conditional random field
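To make the idea of CRF refinement concrete, the following is a minimal sketch of mean-field inference in a fully-connected CRF applied to coarse per-pixel class scores. It is not the FuseCRFNet layer itself: it assumes a single Gaussian kernel over pixel features, a Potts label compatibility, and brute-force O(N^2) message passing, and it covers only the forward refinement (no error back-propagation). All names (`mean_field`, `bandwidth`, `weight`, `unary`, `positions`) are hypothetical and chosen for illustration.

```python
import numpy as np


def softmax(x):
    # Row-wise softmax with max subtraction for numerical stability.
    x = x - x.max(axis=1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=1, keepdims=True)


def mean_field(unary_logits, features, n_iters=5, bandwidth=1.0, weight=1.0):
    """Approximate dense-CRF marginals by mean-field iterations.

    unary_logits: (N, C) coarse per-pixel class scores (assumed CNN output).
    features:     (N, D) per-pixel features, e.g. position, colour or depth.
    """
    n_classes = unary_logits.shape[1]
    # Potts compatibility (assumption): penalise only differing labels.
    compat = 1.0 - np.eye(n_classes)

    # Gaussian affinity between every pair of pixels.
    diff = features[:, None, :] - features[None, :, :]
    kernel = weight * np.exp(-(diff ** 2).sum(axis=-1) / (2.0 * bandwidth ** 2))
    np.fill_diagonal(kernel, 0.0)  # a pixel sends no message to itself

    q = softmax(unary_logits)
    for _ in range(n_iters):
        message = kernel @ q        # (N, C) label mass gathered from other pixels
        penalty = message @ compat  # (N, C) cost of disagreeing with that mass
        q = softmax(unary_logits - penalty)
    return q


# Toy 1-D "image": five pixels, two classes. The middle pixel weakly
# prefers class 1 while all of its neighbours strongly prefer class 0.
unary = np.array([[2.0, 0.0], [2.0, 0.0], [0.0, 0.5], [2.0, 0.0], [2.0, 0.0]])
positions = np.arange(5, dtype=float).reshape(-1, 1)
refined = mean_field(unary, positions)
```

In this toy case the isolated middle label is pulled toward its neighbours' label, which is the kind of smoothing a CRF stage contributes on top of coarse CNN predictions. Practical dense CRFs replace the quadratic kernel computation with efficient approximate filtering.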

1 Introduction
In recent years, computer vision research has been thriving under the influence of deep learning and big data [51]. The focus of research has shifted from simple object recognition to scene understanding, which plays an important role in many fields such as autonomous driving, human-computer interaction, and intelligent robots [20]. Techniques for more accurate analysis and understanding of the environment have become popular, among which semantic segmentation has always been the most crucial task. With the aid of deep learning, image semantic segmentation is changing from manual labeling to automatic recognition. Especially since the convolutional neural network (CNN) achieved a breakthrough in image classification, computer vision has been heavily influenced by deep learning. Many excellent neural networks are now available specifically for image segmentation, such as FCN [23], SegNet [2], and DeepLab [7]. However, most semantic segmentation models operate on RGB images only and are therefore limited by their inability to handle geometric information. Since more and mor

* Wei Li [email protected] Extended author information available on the last page of the article

Multimedia Tools and Applications
