Robust and Accurate Line- and/or Point-Based Pose Estimation without Manhattan Assumptions




1 LIGM, UMR 8049, École des Ponts, UPE, Champs-sur-Marne, France {yohann.salaun,renaud.marlet,pascal.monasse}@enpc.fr
2 CentraleSupélec, Châtenay-Malabry, France

Abstract. Usual Structure from Motion techniques based on feature points have a hard time on scenes with little texture or presenting a single plane, as in indoor environments. Line segments are more robust features in this case. We propose a novel geometrical criterion for two-view pose estimation using lines that does not assume a Manhattan world. We also define a parameterless (a contrario) RANSAC-like method to discard calibration outliers and provide more robust pose estimates, possibly using points as well when available. Finally, we provide quantitative experimental data that illustrate failure cases of other methods and show how our approach outperforms them, both in robustness and precision.

1 Introduction

Structure from Motion (SfM) techniques are now able to reliably recover the relative pose of cameras (external calibration) in many common settings, enabling 3D reconstruction from images as well as robotic navigation (SLAM). However, they still have a hard time in a number of practical situations, in particular in indoor environments, where surfaces are mainly planar with little or no texture. Indeed, SfM techniques are mostly based on the detection of salient points, which are scarce in indoor settings and may occur in degenerate configurations, such as all lying on a single plane. As a result, camera calibration can fail or yield inaccurate pose estimates.

Electronic supplementary material: The online version of this chapter (doi:10.1007/978-3-319-46478-7_49) contains supplementary material, which is available to authorized users.
© Springer International Publishing AG 2016. B. Leibe et al. (Eds.): ECCV 2016, Part VII, LNCS 9911, pp. 801–818, 2016. DOI: 10.1007/978-3-319-46478-7_49
Y. Salaün et al.

Fig. 1. To register two images, we use the relation between reprojected parallel 3D lines. This allows a more robust and accurate calibration in indoor scenes where point-based calibration fails.

Furthermore, a number of 3D reconstruction applications call for a reduced number of images to lower the acquisition burden. For instance, when a whole building is to be captured to generate a building information model (BIM), being able to take only a few pictures per room is more cost-effective. It may even be a necessity for renovation companies, which have only short and limited access to a building before submitting a tender. At this commercial stage, they do not look for the most accurate 3D information but for one that is easy to capture and reliable enough to construct a sound bid. Some other companies also propose 3D tools and services to rethink the layout of rooms, possibly placing furniture advertisements as well. For private individuals not to be dissuaded from engaging in this process, it must be easy for them to get a good approximate 3D view of their accommodation from only a few pictures. But lowering the number of images means that th