ObjectNet3D: A Large Scale Database for 3D Object Recognition



Abstract. We contribute a large scale database for 3D object recognition, named ObjectNet3D, that consists of 100 categories, 90,127 images, 201,888 objects in these images and 44,147 3D shapes. Objects in the 2D images in our database are aligned with the 3D shapes, and the alignment provides both accurate 3D pose annotation and the closest 3D shape annotation for each 2D object. Consequently, our database is useful for recognizing the 3D pose and 3D shape of objects from 2D images. We also provide baseline experiments on four tasks: region proposal generation, 2D object detection, joint 2D detection and 3D object pose estimation, and image-based 3D shape retrieval, which can serve as baselines for future research using our database. Our database is available online at http://cvgl.stanford.edu/projects/objectnet3d.

Keywords: Database construction · 3D object recognition

1 Introduction

Recognizing 3D properties of objects from 2D images, such as 3D location, 3D pose and 3D shape, is a central problem in computer vision that has wide applications in different scenarios including robotics, autonomous driving and augmented reality. In recent years, remarkable progress has been achieved on 3D object recognition (e.g. [9,15,24,33,39,42]), as the field has benefited from the introduction of several important databases that provide 3D annotations to 2D objects. For example, the NYU Depth dataset [29] associates depth to 2D images; the KITTI dataset for autonomous driving [10] aligns 2D images with 3D point clouds; and the PASCAL3D+ dataset [40] aligns 2D objects in images with 3D CAD models. With the provided 3D information, supervised learning techniques can be applied to recognize 3D properties of objects. In addition, these datasets serve as benchmarks for comparing different approaches. However, the existing databases with 3D annotations are limited in scale, either in the number of object categories or in the number of images. At least, they are not comparable to large scale 2D image databases such as ImageNet [1] or Microsoft COCO [21]. After witnessing the progress on image classification, 2D object detection and segmentation with the advance of such large scale 2D image databases, we believe that a large scale database with 3D annotations would significantly benefit 3D object recognition.

Electronic supplementary material: The online version of this chapter (doi:10.1007/978-3-319-46484-8_10) contains supplementary material, which is available to authorized users.
© Springer International Publishing AG 2016. B. Leibe et al. (Eds.): ECCV 2016, Part VIII, LNCS 9912, pp. 160–176, 2016. DOI: 10.1007/978-3-319-46484-8_10

Fig. 1. An example image in our database with 2D objects aligned with 3D shapes. The alignment enables us to project each 3D shape to the image where its projection overlaps with the 2D object as shown in the image on the right
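To make the projection in Fig. 1 concrete, the following is a minimal sketch of how an aligned 3D shape could be projected into an image. It assumes a simple pinhole camera; the function names, the rotation/translation/focal-length parameters and the toy cube are all hypothetical illustrations and do not represent the annotation format or code of the database.

```python
import math

def rotation_about_y(angle_rad):
    """3x3 rotation matrix about the vertical (y) axis. Hypothetical helper."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def project_points(vertices, rotation, translation, focal, principal_point):
    """Project 3D model vertices to pixel coordinates.

    Assumes a pinhole camera: x_cam = R * x_model + t, followed by a
    perspective divide and a shift by the principal point.
    """
    pixels = []
    for v in vertices:
        # Model coordinates -> camera coordinates.
        cam = [sum(rotation[i][j] * v[j] for j in range(3)) + translation[i]
               for i in range(3)]
        if cam[2] <= 0:
            raise ValueError("vertex lies behind the camera (z <= 0)")
        # Perspective divide, then shift to the image origin.
        u = focal * cam[0] / cam[2] + principal_point[0]
        w = focal * cam[1] / cam[2] + principal_point[1]
        pixels.append((u, w))
    return pixels

def bounding_box(pixels):
    """Axis-aligned box (u_min, v_min, u_max, v_max) around projected points."""
    us = [p[0] for p in pixels]
    vs = [p[1] for p in pixels]
    return (min(us), min(vs), max(us), max(vs))

# Toy "3D shape": the eight corners of a unit cube centered at the origin.
cube = [(x, y, z) for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (-0.5, 0.5)]

# Assumed pose: no rotation, cube placed 5 units in front of the camera.
pixels = project_points(cube, rotation_about_y(0.0), (0.0, 0.0, 5.0),
                        focal=500.0, principal_point=(320.0, 240.0))
box = bounding_box(pixels)
```

Comparing `box` with the annotated 2D bounding box of the object is one simple way to check that a candidate pose makes the projection overlap the object, which is the property the caption describes.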

In this work, we contribute a large scale database for 3D object recognition, named ObjectNet3D, that consis