Fast 6D Pose Estimation from a Monocular Image Using Hierarchical Pose Trees


1 OMRON Corporation, Kyoto, Japan {ykoni,hanzawa,kawade}@ari.ncl.omron.co.jp
2 Chukyo University, Nagoya, Japan [email protected]

Abstract. It has been shown that template-based approaches can quickly estimate the 6D pose of texture-less objects from a monocular image. However, they tend to be slow when the number of templates grows to tens of thousands in order to handle a wider range of 3D object poses. To alleviate this problem, we propose a novel image feature and a tree-structured model. Our proposed perspectively cumulated orientation feature (PCOF) is based on orientation histograms extracted from randomly generated 2D projection images using 3D CAD data, and a template using PCOF explicitly handles a certain range of 3D object poses. The hierarchical pose trees (HPT) are built by clustering 3D object poses and reducing the resolutions of templates, and HPT accelerates 6D pose estimation based on a coarse-to-fine strategy with an image pyramid. In the experimental evaluation on our texture-less object dataset, the combination of PCOF and HPT showed higher accuracy and faster speed than state-of-the-art techniques.

Keywords: 6D pose estimation · Texture-less objects · Template matching

1 Introduction

Fast and accurate 6D pose estimation of object instances is one of the most important computer vision technologies for robotic applications, both for industrial and consumer robots. In recent years, low-cost 3D sensors such as Microsoft Kinect have become popular, and they have often been used for object detection and recognition in academic research. However, much more reliability and durability are required of sensors in industrial applications than in consumer applications. Thus 3D sensors for industry are often far more expensive, larger, and heavier than consumer ones.

Electronic supplementary material: The online version of this chapter (doi:10.1007/978-3-319-46448-0_24) contains supplementary material, which is available to authorized users. © Springer International Publishing AG 2016. B. Leibe et al. (Eds.): ECCV 2016, Part I, LNCS 9905, pp. 398–413, 2016. DOI: 10.1007/978-3-319-46448-0_24

Fig. 1. Our new template-based algorithm can estimate the 6D pose of texture-less and shiny objects from a monocular image which contains cluttered backgrounds and partial occlusions. It takes an average of approximately 150 ms on a single CPU core.

Additionally, most 3D sensors, even those for industry, cannot handle objects with specular surfaces, are sensitive to illumination conditions, and require cumbersome 3D calibration. For those reasons, monocular cameras are mainly used in current industrial applications, and fast and accurate 6D pose estimation from a monocular image is still an important technique. Many industrial parts and products have little texture on their surfaces; they are so-called texture-less objects. Object detection methods based on keypoints and local descriptors such as SIFT [1] and S