Real-Time Monocular Segmentation and Pose Tracking of Multiple Objects
1 Computer Science Department, RheinMain University of Applied Sciences, Wiesbaden, Germany
{henning.tjaden,ulrich.schwanecke}@hs-rm.de
2 Institute of Computer Science, Johannes Gutenberg University Mainz, Mainz, Germany
[email protected]
Abstract. We present a real-time system capable of segmenting multiple 3D objects and tracking their pose using a single RGB camera, based on prior shape knowledge. The proposed method uses twist coordinates for pose parametrization and a pixel-wise second-order optimization approach, which together lead to major improvements in tracking robustness, especially in cases of fast motion and scale changes, compared to previous region-based approaches. Our implementation runs at about 50–100 Hz on a commodity laptop when tracking a single object, without relying on GPGPU computations. We compare our method to the current state of the art in various experiments involving challenging motion sequences and different complex objects.

Keywords: Tracking · Segmentation · Real-time · Monocular · Pose estimation · Model-based · Shape knowledge

1 Introduction
Tracking the 3D motion of a rigid object from its 2D projections in the image sequence of a single camera is one of the main research areas in computer vision. It involves estimating the pose, i.e. the 3D translation and rotation, of the object relative to the camera in each image. The fields of application for visual 3D object tracking are numerous and include visual servoing of robots, medical navigation and visualization, sports therapy, augmented reality systems, and human-computer interaction. Many different solutions to this problem have been developed over the years and are now part of a variety of practical applications. For a survey of monocular 3D tracking of rigid objects see e.g. [1]. Recently, so-called region-based pose estimation methods have emerged, which are mainly based on statistical level-set segmentation approaches [2].

Electronic supplementary material: The online version of this chapter (doi:10.1007/978-3-319-46493-0_26) contains supplementary material, which is available to authorized users.
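As background for the twist parametrization mentioned in the abstract, the following is a minimal numpy sketch of the exponential map from twist coordinates to a rigid-body transform. The (v, ω) ordering and all function names are illustrative assumptions and not taken from the paper; in a tracker, each optimization iteration would compose such an incremental transform with the current pose estimate.

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix [w]_x such that skew(w) @ p == np.cross(w, p)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def twist_exp(xi):
    """Map a twist xi = (v, w) in R^6 to a 4x4 rigid-body transform in SE(3).

    v encodes the translational part, w the rotational part (axis * angle).
    Uses the closed-form exponential map: Rodrigues' formula for the rotation
    block plus the corresponding 'V' matrix for the translation.
    """
    v, w = xi[:3], xi[3:]
    theta = np.linalg.norm(w)
    Omega = skew(w)
    if theta < 1e-8:
        # First-order approximation near the identity.
        R = np.eye(3) + Omega
        t = v
    else:
        A = np.sin(theta) / theta
        B = (1.0 - np.cos(theta)) / theta**2
        C = (theta - np.sin(theta)) / theta**3
        R = np.eye(3) + A * Omega + B * (Omega @ Omega)
        V = np.eye(3) + B * Omega + C * (Omega @ Omega)
        t = V @ v
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

# Example: apply a small incremental pose update delta_xi to a current pose T.
T = np.eye(4)
delta_xi = np.array([0.01, 0.0, 0.02, 0.0, 0.05, 0.0])  # (v, w), illustrative values
T = twist_exp(delta_xi) @ T
```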
Fig. 1. Example of region-based 3D tracking of three different objects with partial occlusions. All poses are determined by the proposed method within about 30 ms. Left: Augmented reality view of the scene, where the rendered models yield a segmentation of the objects in the image. Right: 3D overview of the scene.
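The exact cost function and optimization details are not given in this excerpt, so the following is only a generic sketch of the pixel-wise posterior energy underlying statistical region-based (level-set) approaches such as [2]. The signed-distance embedding, the smoothed Heaviside, and the names p_fg/p_bg are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def smoothed_heaviside(phi, s=1.2):
    """Smoothed Heaviside of the level-set embedding (sharpness s is illustrative)."""
    return np.arctan(s * phi) / np.pi + 0.5

def region_energy(silhouette, p_fg, p_bg):
    """Pixel-wise region-based energy for a rendered silhouette mask.

    silhouette : boolean mask of the projected model under the current pose
    p_fg, p_bg : per-pixel likelihoods of the observed color under the
                 foreground / background appearance models
    """
    # Signed distance transform as level-set embedding (positive inside the silhouette).
    phi = distance_transform_edt(silhouette) - distance_transform_edt(~silhouette)
    He = smoothed_heaviside(phi)
    # Per-pixel mixture of the two appearance models, weighted by the Heaviside.
    mix = He * p_fg + (1.0 - He) * p_bg
    return -np.sum(np.log(np.clip(mix, 1e-12, None)))
```

Minimizing such an energy with respect to the pose (e.g. via the twist update sketched above) simultaneously refines the segmentation and the pose, which is the basic idea the figure illustrates.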
Region-based methods do not require any kind of artificial marker or other augmentation of the object of interest; they rely only on a 3D model of the object and therefore fall into the category of model-based pose estimation methods. These are very attractive for application scenarios where it is undesirable or even impossible to modify the objects, for example in sports therapy, where ergonomics might be affected, or for tracking