Patch attention network with generative adversarial model for semi-supervised binocular disparity prediction


ORIGINAL ARTICLE

Zhibo Rao1 · Mingyi He1 · Yuchao Dai1 · Zhelun Shen2

Accepted: 16 October 2020 © Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract In this paper, we address two challenges in binocular disparity estimation: (1) unsatisfactory results in occluded regions when a warping function is used in unsupervised learning; (2) long running time and a large number of parameters, caused by the heavy use of 3D convolutions in the feature matching module. To address these drawbacks, we propose a patch attention network for semi-supervised stereo matching. First, we employ a channel-attention mechanism that aggregates the cost volume by selecting its different surfaces, which reduces the number of 3D convolutions; we call the resulting model the patch attention network (PA-Net). Second, we use PA-Net as a generator and combine it with a traditional unsupervised learning loss and an adversarial learning model to construct a semi-supervised learning framework that improves performance in occluded areas. We train PA-Net in supervised, semi-supervised, and unsupervised manners. Extensive experiments show that (1) our semi-supervised learning framework can overcome the drawbacks of unsupervised learning and significantly improve performance in ill-posed regions using only a few or inaccurate ground truths; (2) PA-Net can outperform other state-of-the-art approaches in supervised learning while using fewer parameters.

Keywords Binocular disparity estimation · Semi-supervised learning · Patch attention mechanism · Generative adversarial model

Zhibo Rao raoxi36@foxmail.com
Mingyi He myhe@nwpu.edu.cn

1 Northwestern Polytechnical University, Xian 710129, China
2 Peking University, Beijing 100871, China

1 Introduction

Stereo matching is a fundamental research topic in computer vision, with applications such as autonomous driving [26,41], robot navigation [3,31,38], and 3D reconstruction [13,20]. It aims to estimate a disparity map by matching pixels between a pair of rectified images [22]. Following the groundbreaking progress of deep learning, current state-of-the-art stereo matching methods employ deep convolutional neural networks (CNNs) to regress a dense disparity map [4,8,18]. From the perspective of network structure, such a model can be decomposed into three modules: feature extraction, feature matching, and disparity regression [5,50]. Among them, the feature matching module is crucial for obtaining an accurate disparity estimate. In recent years, the 3D convolution operation has often been used to build the relationship among the disparity, height, width, and feature dimensions [4,18]. The results indicate that the 3D convolution operation can enhance the geometry learning ability and improve the matching accuracy in occlusions. However, it also brings a computation cost problem by using a lot of 3D convolution operations to down-sampli
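To make the three-module decomposition described above concrete, the following minimal NumPy sketch illustrates the feature matching and disparity regression steps on pre-extracted feature maps. It is not the PA-Net and contains no 3D convolutions or attention. The function names `build_cost_volume` and `soft_argmin`, the absolute-difference matching cost, and the `temperature` parameter are assumptions introduced only for illustration.

```python
import numpy as np


def build_cost_volume(left_feat, right_feat, max_disp):
    """Hypothetical helper: build a (max_disp, H, W) matching-cost volume.

    left_feat, right_feat: feature maps of shape (C, H, W) from a rectified pair.
    Lower cost means a better match. Assumes max_disp <= W.
    """
    _, h, w = left_feat.shape
    # Positions without a valid match candidate keep an infinite cost.
    cost = np.full((max_disp, h, w), np.inf, dtype=np.float64)
    for d in range(max_disp):
        # On a rectified pair, left pixel x corresponds to right pixel x - d.
        diff = left_feat[:, :, d:] - right_feat[:, :, : w - d]
        cost[d, :, d:] = np.abs(diff).mean(axis=0)
    return cost


def soft_argmin(cost, temperature=1.0):
    """Hypothetical helper: regress a sub-pixel disparity map of shape (H, W).

    Applies a softmax over the negated cost along the disparity axis and
    returns the probability-weighted mean disparity.
    """
    logits = -cost / temperature
    # Subtract the per-pixel maximum for numerical stability (d = 0 is always valid).
    logits = logits - logits.max(axis=0, keepdims=True)
    prob = np.exp(logits)
    prob /= prob.sum(axis=0, keepdims=True)
    disparities = np.arange(cost.shape[0], dtype=np.float64).reshape(-1, 1, 1)
    return (prob * disparities).sum(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    right = rng.standard_normal((8, 4, 32))
    # Synthetic left view: the right features shifted by a known disparity of 3.
    left = np.roll(right, 3, axis=2)
    volume = build_cost_volume(left, right, max_disp=8)
    print(soft_argmin(volume, temperature=0.01)[:, 3:].round(2))
```

In a learned pipeline, the cost volume would typically carry a feature dimension as well and be filtered before regression; that filtering is the stage where the 3D convolutions discussed above are applied.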
