Monocular depth estimation based on deep learning: An overview

https://doi.org/10.1007/s11431-020-1582-8

Special Topic: Industrial Artificial Intelligence

Review

ZHAO ChaoQiang, SUN QiYu, ZHANG ChongZhen, TANG Yang* & QIAN Feng

Key Laboratory of Advanced Control and Optimization for Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai 200237, China

Received February 27, 2020; accepted March 25, 2020; published online June 10, 2020

Depth information is important for autonomous systems to perceive their environments and estimate their own state. Traditional depth estimation methods, such as structure from motion and stereo vision matching, are built on feature correspondences across multiple viewpoints, and the depth maps they predict are sparse. Inferring depth information from a single image (monocular depth estimation) is an ill-posed problem. With the rapid development of deep neural networks, monocular depth estimation based on deep learning has been widely studied in recent years and has achieved promising accuracy. Moreover, deep neural networks estimate dense depth maps from single images in an end-to-end manner. To improve the accuracy of depth estimation, different kinds of network frameworks, loss functions and training strategies have since been proposed. Therefore, we survey the current monocular depth estimation methods based on deep learning in this review. First, we summarize several widely used datasets and evaluation indicators in deep learning-based depth estimation. We then review some representative existing methods according to their training manner: supervised, unsupervised and semi-supervised. Finally, we discuss the challenges and provide some ideas for future research in monocular depth estimation.

Keywords: autonomous systems, monocular depth estimation, deep learning, unsupervised learning

Citation: Zhao C Q, Sun Q Y, Zhang C Z, et al. Monocular depth estimation based on deep learning: An overview. Sci China Tech Sci, 2020, 63, https://doi.org/10.1007/s11431-020-1582-8

1 Introduction

Estimating depth information from images is one of the basic and important tasks in computer vision, and it is widely used in simultaneous localization and mapping (SLAM) [1], navigation [2], object detection [3] and semantic segmentation [4].

Geometry-based methods: Recovering 3D structures from a couple of images based on geometric constraints is a popular way to perceive depth, and it has been widely investigated over the past forty years. Structure from motion (SfM) [5] is a representative method for estimating 3D structures from 2D image sequences and has been applied successfully in 3D reconstruction [6] and SLAM [7]. SfM recovers the depth of sparse features through feature correspondences and geometric constraints between image sequences, so the accuracy of depth estimation relies heavily on exact feature matching and high-quality image sequences. Furthermore, SfM suffers from monocular
