Road segmentation with image-LiDAR data fusion in deep neural network



Huafeng Liu1 · Yazhou Yao1 · Zeren Sun1 · Xiangrui Li1 · Ke Jia2 · Zhenming Tang1

Received: 26 January 2019 / Revised: 16 April 2019 / Accepted: 5 June 2019 / © Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract Robust road segmentation is a key challenge in self-driving research. Although many image-based methods have been studied and have reported high performance on benchmark datasets, developing robust and reliable road segmentation remains a major challenge. Fusing data from different sensors is widely considered an important and irreplaceable way to improve road segmentation performance. In this paper, we propose a novel structure that fuses images and LiDAR point clouds in an end-to-end semantic segmentation network, in which the fusion is performed at the decoder stage instead of, as is more common, at the encoder stage. During fusion, we improve the precision of the multi-scale LiDAR maps by introducing a pyramid projection method for their generation. Additionally, we adapt the multi-path refinement network to our fusion strategy, which improves road prediction compared with transposed convolution with skip layers. Our approach has been tested on the KITTI ROAD dataset and achieves competitive performance.

Keywords Road segmentation · Data fusion · Deep learning

Huafeng Liu [email protected]
Ke Jia [email protected]
Zhenming Tang [email protected]

1 School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing City 210094, China

2 School of Computer Science, Chengdu University of Information Technology, Chengdu City, China

Multimedia Tools and Applications

1 Introduction

With the rapid growth of intelligent transportation system research, autonomous driving technology has attracted increasing attention. Road segmentation, as one of its crucial tasks, is a basic requirement for autonomy and mobility [5, 27]. In road segmentation research, various methods have been proposed to find the road area in RGB images [9] or 3D LiDAR point clouds [4, 7]. However, the colors, textures and shapes of roads can vary greatly with illumination, weather and scene, which keeps road segmentation a challenging task.

Deep learning is a powerful tool for learning representations in basic multimedia tasks, such as image classification [36, 39], image search [23–25], image segmentation [16], and scene recognition [29–31, 35, 40]. It is well known that the features used in these tasks have a great impact on final performance, and Convolutional Neural Networks (CNNs) have recently demonstrated that automatic feature learning on massive annotated data surpasses hand-crafted features in many applications. As a result, more and more researchers are trying to exploit Deep Neural Networks (DNNs) in many fields. Deep convolutional neural networks, like VGG [26] and Residual Net [13], are used as encoder