Deep Learning Based 3D Vision



In So Kweon and Francois Rameau
KAIST, Daejeon, South Korea

Synonyms

Machine learning for 3D vision

Related Concepts

3D Reconstruction; CNN; Deep Learning; Multi-view Stereo; Stereo Matching; Structure-From-Motion; Visual Localization; Visual Odometry

Definition

The field of 3D vision covers a wide range of techniques developed to estimate 3D information from one or multiple images, such as the absolute or relative pose of a camera and the 3D structure of the scene. For instance, among 3D vision problems, visual odometry and visual localization consist in estimating the pose of a camera in its environment, while stereo matching and multi-view stereo aim to reconstruct the 3D structure of the scene from different viewpoints. Conventional techniques for these tasks traditionally rely on low-level image features, such as sparse keypoints or dense photometric matching. These strategies are efficient under favorable conditions but tend to be sensitive to poorly textured scenes and usually do not encapsulate contextual or semantic information. Conversely, deep learning techniques are known for their ability to extract high-level features carrying relevant and useful information. Deep learning based 3D vision takes advantage of deep neural network architectures to improve the robustness and reliability of 3D vision tasks in challenging contexts.

© Springer Nature Switzerland AG 2020
K. Ikeuchi (ed.), Computer Vision, https://doi.org/10.1007/978-3-030-03243-2_848-1
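To make the "dense photometric matching" used by conventional stereo concrete, the following is a minimal winner-takes-all matcher with a per-pixel squared-difference cost on a rectified image pair. This is a simplified sketch, not a production algorithm: real pipelines aggregate the cost over local windows and handle occlusions, which is exactly where poorly textured regions cause failures.

```python
import numpy as np

def ssd_disparity(left, right, max_disp):
    """Winner-takes-all stereo matching with a per-pixel SSD photometric cost.

    left, right: rectified grayscale images as 2D float arrays.
    Returns an integer disparity map. A pixel (y, x) in the left image is
    compared against candidates (y, x - d) in the right image for
    d = 0 .. max_disp, and the cheapest candidate wins.
    """
    h, w = left.shape
    cost = np.full((h, w, max_disp + 1), np.inf)
    for d in range(max_disp + 1):
        # left pixel at column x matches right pixel at column x - d
        cost[:, d:, d] = (left[:, d:] - right[:, : w - d]) ** 2
    return np.argmin(cost, axis=2)
```

On textured input the true disparity gives a near-zero cost and is recovered; on constant-intensity regions every candidate has the same cost, which illustrates why such photometric strategies degrade in texture-less scenes.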

Deep Visual Odometry/SLAM

Visual odometry (VO) consists in the joint estimation of the camera motion and the local structure of the scene using one or multiple cameras. Taking advantage of deep learning in the context of VO is particularly relevant for improving its robustness and applicability. Indeed, conventional techniques remain limited in their ability to reconstruct the 3D scene at metric scale (in the monocular case) and are particularly inefficient in texture-less environments, where semantic cues can play a crucial role. The first techniques applying deep learning to the relative pose estimation problem focus on motion and depth estimation between two images. This is, for instance, the case of DeMoN [1], which iteratively refines the optical flow and the motion estimate of the sensor through a succession of encoder-decoder networks. This work led to the development of multiple analogous approaches dedicated to solving motion and depth estimation in an end-to-end manner. While these techniques are efficient, they are not specifically designed for visual odometry and tend to accumulate significant drift when a long sequence of images is processed. More ad hoc solutions have been designed to cope with this drift by imposing temporal constraints in the motion estimation process. The first attempt to incorporate such an a priori is DeepVO [2], which employs a Recurrent Convolutional Neural Network (RCNN) to learn the motion dynamics of the sensor and to impose sequential in
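The drift discussed above arises because frame-to-frame VO integrates a chain of relative motion estimates, so any per-step error is propagated to every subsequent pose. The following is a minimal sketch of this dead-reckoning integration for planar (2D) poses; the function name and the (dx, dy, dtheta) parameterization are illustrative choices, not part of any cited system.

```python
import numpy as np

def integrate_relative_motions(rel_motions):
    """Chain body-frame relative motions (dx, dy, dtheta) into a
    world-frame trajectory, as a VO front end does.

    Because each step is composed onto the previous pose, an error in a
    single relative estimate shifts every later pose: this compounding
    is the drift that sequence-level constraints aim to suppress.
    """
    x, y, th = 0.0, 0.0, 0.0
    trajectory = [(x, y, th)]
    for dx, dy, dth in rel_motions:
        c, s = np.cos(th), np.sin(th)
        x += c * dx - s * dy  # rotate the body-frame step into the world frame
        y += s * dx + c * dy
        th += dth
        trajectory.append((x, y, th))
    return np.array(trajectory)
```

Methods such as DeepVO attack this by processing the image sequence with a recurrent network, so that each relative motion estimate is conditioned on the preceding frames rather than computed independently.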