Sensor Fusion for Sparse SLAM with Descriptor Pooling
Abstract. This paper focuses on the advancement of a monocular sparse-SLAM algorithm via two techniques: local feature maintenance and descriptor-based sensor fusion. We present two techniques that maintain the descriptor of a local feature: pooling and best-fit. The maintenance procedure aims at producing more accurate descriptors, increasing matching performance and thereby tracking accuracy. Moreover, sensors besides the camera can be used to improve tracking robustness and accuracy via sensor fusion. State-of-the-art sensor fusion techniques can be divided into two categories: they either use a Kalman filter that includes sensor data in its state vector to conduct a posterior pose update, or they create world-aligned image descriptors with the help of the gyroscope. This paper is the first to compare and combine these two approaches. We release a new evaluation dataset comprising 21 scenes, each with a dense ground-truth trajectory, IMU data, and camera data. The results indicate that descriptor pooling significantly improves pose accuracy. Furthermore, we show that descriptor-based sensor fusion outperforms Kalman filter-based approaches (EKF and UKF).
1 Introduction
Handheld devices are ubiquitous and are usually equipped with a video camera, which enables simultaneous localization and mapping (SLAM). Handhelds also include additional sensors, in particular inertial measurement units (IMUs), which can improve SLAM accuracy [1]. Combining the video stream with this additional sensor data requires multi-sensor fusion, which is commonly achieved with Kalman filters [2–4]. Besides the Kalman filter approaches, a vision-based approach exists that improves the image descriptor via gyroscope data [5]. This work compares sensor fusion via an unscented Kalman filter (UKF) with sensor fusion via gravity-aligned feature descriptors (GAFD) [5]. Both approaches are integrated into the parallel tracking and mapping (PTAM) algorithm [6]. Furthermore, we replace the patch-based PTAM matching with descriptor-based matching, e.g., SIFT [7]. Image feature detection aims at finding salient positions in images at which descriptors are extracted that are robust to changes in scale and rotation. This modification allows us to propose two new descriptor maintenance techniques for improved matching and tracking accuracy. In summary, the contributions of this work are the following: (a) a new dataset for the evaluation of SLAM algorithms, (b) a new descriptor maintenance technique for higher pose accuracy, and (c) a two-way sensor fusion technique combining the UKF with GAFD.

© Springer International Publishing Switzerland 2016. G. Hua and H. Jégou (Eds.): ECCV 2016 Workshops, Part III, LNCS 9915, pp. 698–710, 2016. DOI: 10.1007/978-3-319-49409-8_58
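The excerpt names the two maintenance strategies (pooling and best-fit) but does not spell out their rules. A minimal sketch of one plausible reading, assuming unit-normalized SIFT-like descriptors and using hypothetical helper names (`pool_descriptors`, `bestfit_descriptor`) not taken from the paper:

```python
import numpy as np

def pool_descriptors(observations):
    """Pooling: average all descriptors observed for one map point
    across frames, then re-normalize to unit length."""
    pooled = np.mean(np.asarray(observations, dtype=np.float64), axis=0)
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled

def bestfit_descriptor(observations):
    """Best-fit: keep the single observed descriptor that lies
    closest (L2 distance) to the mean of all observations."""
    obs = np.asarray(observations, dtype=np.float64)
    dists = np.linalg.norm(obs - obs.mean(axis=0), axis=1)
    return obs[np.argmin(dists)]
```

Pooling smooths per-frame noise into a synthetic descriptor, while best-fit guarantees the maintained descriptor is one that was actually observed; which behavior is preferable depends on how far the true appearance drifts across viewpoints.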
2 Related Work
The PTAM [6] algorithm belongs to the keyframe-based monocular SLAM methods. It differs from the filtering-based approaches [8]: the knowledge of the system is not represented by a probability distribution over camera pose and map.
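Filtering-based approaches, by contrast, carry that probability distribution explicitly and fuse sensor data through a predict/update cycle. As a toy illustration only (a linear 1-D constant-velocity Kalman filter, not the paper's UKF over a full pose state), the posterior update mentioned in the abstract looks like:

```python
import numpy as np

# State x = [position, velocity]; an IMU-style motion model predicts,
# a camera-style position measurement corrects.
F = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # constant-velocity transition (dt = 1)
H = np.array([[1.0, 0.0]])   # camera observes position only
Q = 0.01 * np.eye(2)         # process (motion model) noise
R = np.array([[0.25]])       # measurement noise

def kf_step(x, P, z):
    # Predict: propagate state and covariance through the motion model.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: posterior correction with the camera measurement z.
    y = z - H @ x                    # innovation
    S = H @ P @ H.T + R              # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)   # Kalman gain
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P
    return x, P
```

Keyframe-based methods such as PTAM avoid maintaining this per-frame distribution and instead optimize over a retained set of keyframes, which is the distinction drawn in [8].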