
ORIGINAL RESEARCH PAPER

Fine semantic mapping based on dense segmentation network

Guoyu Zuo1,2 · Tao Zheng1,2 · Yuelei Liu1,2 · Zichen Xu1,2 · Daoxiong Gong1,2 · Jianjun Yu1,2

Corresponding author: Guoyu Zuo, [email protected]

1 Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
2 Beijing Key Laboratory of Computing Intelligence and Intelligent Systems, Beijing 100124, China

Received: 5 December 2019 / Accepted: 19 October 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract

This paper proposes a fine semantic mapping method based on a dense segmentation network (DS-Net) to achieve good semantic mapping fusion performance. First, the RGB and depth images are used to generate a dense indoor scene map via a state-of-the-art dense SLAM system (ElasticFusion). Then, DS-Net is constructed on DenseNet's dense connections to perform precise semantic segmentation of the input RGB images. Finally, long-term correspondences are established between the indoor scene map and the landmarks using consecutive frames both in the visual odometry and in loop-closure detection, and the final semantic map is obtained by fusing the indoor scene map with the semantic predictions of RGB-D video frames taken from multiple viewpoints. Experiments were performed on the NYUv2, PASCAL VOC 2012 and CIFAR-10 datasets and in our laboratory environments. The results show that our method reduces the error in dense map construction and achieves good semantic segmentation performance.

Keywords Semantic segmentation · RGB-D · DS-Net · DenseNet · ElasticFusion
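The second step above builds DS-Net on DenseNet's dense connection pattern, in which every layer receives the concatenated feature maps of all preceding layers. The PyTorch sketch below illustrates only that connection pattern; the layer names, growth rate and block depth are illustrative assumptions, not the authors' actual DS-Net configuration.

```python
# Minimal sketch of a DenseNet-style dense block (illustrative only;
# growth rate and depth are assumptions, not the paper's DS-Net settings).
import torch
import torch.nn as nn


class DenseLayer(nn.Module):
    """BN -> ReLU -> 3x3 conv; the output is concatenated with the input."""

    def __init__(self, in_channels: int, growth_rate: int):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_channels)
        self.conv = nn.Conv2d(in_channels, growth_rate,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.conv(torch.relu(self.bn(x)))
        # Dense connection: keep all earlier feature maps alongside the new ones.
        return torch.cat([x, out], dim=1)


class DenseBlock(nn.Module):
    """Stack of densely connected layers; channels grow by growth_rate per layer."""

    def __init__(self, in_channels: int, growth_rate: int = 16, num_layers: int = 4):
        super().__init__()
        layers = []
        channels = in_channels
        for _ in range(num_layers):
            layers.append(DenseLayer(channels, growth_rate))
            channels += growth_rate
        self.block = nn.Sequential(*layers)
        self.out_channels = channels


    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)


# Example: apply one dense block to feature maps of a single RGB frame.
feats = torch.randn(1, 32, 120, 160)
block = DenseBlock(in_channels=32)
print(block(feats).shape)  # torch.Size([1, 96, 120, 160])
```

Because every layer sees all earlier feature maps, gradients and low-level detail propagate directly through the block, which is the property the paper exploits for precise per-pixel segmentation.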

1 Introduction

In the fields of robotics and computer vision, the semantic map lays a foundation for realizing human–robot interaction and human–robot fusion, and it is widely used in robot navigation, robot manipulation and augmented reality. Constructing an incremental and robust semantic map in real time remains an important research issue. Owing to the rapid development of simultaneous localization and mapping (SLAM), a robot can use sparse or dense point clouds to map environments into 2D or 3D grid maps [1]. Great achievements have been made in autonomous navigation and automatic obstacle avoidance by using geometric maps and feature maps. However, the robot cannot obtain more from these maps, which contain only geometric and point-cloud information, so it is difficult or even impossible for the robot to understand complex environments and to be competent for more complex tasks. To achieve a friendly understanding of complex environments, a semantic map that integrates semantic and geometric information must be established to improve the robot's ability in path planning and other more sophisticated tasks.

Currently, a semantic mapping framework has two main parts: semantic segmentation performed by a convolutional neural network (CNN) and map construction based on SLAM. Some CNN-based methods focus on improving the accuracy of semantic segmentation [2,3]. However, to maximally extract the information in the maps, we often deepen