A robust visual SLAM system in dynamic man-made environments


https://doi.org/10.1007/s11431-020-1602-3

Special Topic: Industrial Artificial Intelligence

· Article ·

LIU JiaCheng, MENG ZiYang* & YOU Zheng

Department of Precision Instrument, Tsinghua University, Beijing 100084, China

Received February 20, 2020; accepted April 14, 2020; published online July 23, 2020

This paper presents a robust visual simultaneous localization and mapping (SLAM) system that leverages point and structural line features in dynamic man-made environments. The Manhattan world assumption is considered, and the structural line features in such man-made environments provide rich geometric constraints, e.g., parallelism. Such geometric constraints can therefore be used to rectify 3D map lines after initialization. To cope with dynamic scenarios, the proposed system is divided into four main threads: 2D dynamic object tracking, visual odometry, local mapping, and loop closing. The 2D tracker is responsible for tracking moving objects and capturing them in bounding boxes. In such a case, the dynamic background can be excluded and the outlier point and line features can be effectively removed. To parameterize 3D lines, we use Plücker line coordinates in the initialization and projection processes, and utilize the orthonormal representation in the unconstrained graph optimization process. The proposed system has been evaluated on both benchmark datasets and in real-world scenarios, and it reveals more robust performance in most of the experiments compared with existing state-of-the-art methods.

Keywords: SLAM, Manhattan world, dynamic scenarios, line feature, graph optimization

Citation:

Liu J C, Meng Z Y, You Z. A robust visual SLAM system in dynamic man-made environments. https://doi.org/10.1007/s11431-020-1602-3
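As a rough illustration of the line parameterization described in the abstract (a minimal NumPy sketch, not the authors' implementation; the function names are mine), a 3D line through two points can be expressed in Plücker coordinates (moment m, direction d), and then converted to the minimal 4-DoF orthonormal representation (U ∈ SO(3), W ∈ SO(2)) commonly used for unconstrained graph optimization:

```python
import numpy as np

def plucker_from_points(p1, p2):
    """Plücker coordinates of the 3D line through points p1 and p2:
    direction d = p2 - p1 and moment m = p1 x p2 (note m is orthogonal to d)."""
    d = p2 - p1
    m = np.cross(p1, p2)
    return m, d

def orthonormal_from_plucker(m, d):
    """Minimal 4-DoF orthonormal representation (U, W) of a Plücker line.
    U in SO(3) stacks the unit moment, unit direction, and their cross
    product; W in SO(2) encodes the ratio of the norms of m and d
    (the line's distance from the origin is ||m|| / ||d||)."""
    nm, nd = np.linalg.norm(m), np.linalg.norm(d)
    u1 = m / nm
    u2 = d / nd
    u3 = np.cross(u1, u2)  # unit vector, since m is orthogonal to d
    U = np.column_stack([u1, u2, u3])           # rotation matrix in SO(3)
    s = np.hypot(nm, nd)
    W = np.array([[nm, -nd], [nd, nm]]) / s     # rotation matrix in SO(2)
    return U, W
```

During optimization, (U, W) can be updated with an unconstrained 4-vector increment (three angles for U, one for W), avoiding the internal constraints of the over-parameterized 6D Plücker coordinates.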

*Corresponding author (email: [email protected])

Sci China Tech Sci, 2020, 63

1 Introduction

The visual simultaneous localization and mapping (V-SLAM) problem consists of estimating the position and orientation of a moving camera while simultaneously constructing a map of the unknown environment. In recent years, V-SLAM has been extensively investigated within two popular frameworks, i.e., optimization-based and filter-based ones. It has also been shown that the optimization-based method is more suitable for V-SLAM applications due to its robustness and superior accuracy [1]. In practice, SLAM applications in autonomous vehicles such as micro air vehicles (MAVs), self-driving cars, and automatic guided vehicle systems (AGVs) often face a challenging scenario in which the accurate body pose and environmental map need to be estimated in man-made scenarios with dynamic objects and low-texture scenes. Classical V-SLAM approaches rely on a set of detected feature correspondences from multiple images in static environments. Therefore, these approaches are usually prone to failure in dynamic environments due to occlusion of previously tracked landmarks or false correspondences. To deal with this challenging problem, refs. [2,3] i