Single image 3D object reconstruction based on deep learning: A review


Kui Fu 1 & Jiansheng Peng 1,2 & Qiwen He 1 & Hanxiao Zhang 2

Received: 1 January 2020 / Revised: 19 August 2020 / Accepted: 25 August 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Reconstructing a 3D object from a single image is an important task in computer vision, and in recent years deep learning methods for this task have achieved remarkable results. Traditional methods for reconstructing a 3D object from a single image require prior knowledge and assumptions; they are limited to objects of a certain category, or they struggle to produce a good reconstruction from a real image. Although deep learning can address these problems well thanks to its powerful learning ability, it faces many problems of its own. In this paper, we first discuss the challenges of applying deep learning to reconstruct 3D objects from a single image. Second, we comprehensively review the encoders, decoders and training details used in single image 3D reconstruction. We then introduce the common datasets and evaluation metrics used for single image 3D object reconstruction in recent years. To analyze the advantages and disadvantages of different 3D reconstruction methods, we compare them in a series of experiments. In addition, we briefly give some application examples involving single image 3D reconstruction. Finally, we summarize the paper and discuss future directions.

Keywords Single image 3D reconstruction · Deep learning · Computer vision · 3D shape representation

Kui Fu and Jiansheng Peng contributed equally to this work.

* Jiansheng Peng

1 School of Physics and Mechanical and Electronic Engineering, Hechi University, Yizhou, Guangxi 546300, China

2 School of Electrical and Information Engineering, Guangxi University of Science and Technology, Liuzhou, Guangxi 545006, China

Multimedia Tools and Applications

1 Introduction

Three-dimensional reconstruction of images is a common topic in computer vision, medical image processing [74, 4] and virtual reality [109]. The main purpose of the theory and technology of computer vision is to obtain information from images or multi-dimensional data in order to build artificial intelligence systems. 3D reconstruction of images is one of the main tasks of computer vision; it studies how to generate the corresponding 3D structure from a single image or from multiple images [93, 82]. Depending on the reconstruction target, 3D reconstruction of images can be divided into 3D scene reconstruction and 3D object reconstruction. A major challenge for single-view 3D scene reconstruction is predicting the invisible parts from a single image [38, 108, 100]. Multi-view 3D scene reconstruction [36, 39] and multi-view 3D object reconstruction [18] can integrate the information of multiple images to compensate for the defect of single image prediction uncertaint