Uncalibrated multi-view multiple humans association and 3D pose estimation by adversarial learning


Sara Ershadi-Nasab1 · Shohreh Kasaei2 · Esmaeil Sanaei1

Received: 7 November 2019 / Revised: 10 August 2020 / Accepted: 26 August 2020 / © Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Multiple human 3D pose estimation is a useful but challenging task in computer vision applications. The ambiguities in estimating the 2D and 3D poses of multiple persons can be resolved by using multi-view frames, in which body parts that are occluded or self-occluded in one view might be visible in other camera views. However, when the cameras are moving and uncalibrated, associating multiple human body parts across different camera views is itself a challenging task. This paper presents novel methods for multiple human 3D pose estimation and pose association in multi-view camera frames in an uncalibrated camera setup, using an adversarial learning framework. The generator is a 3D pose estimation network that learns a mapping of distance and angular difference matrices between 2D and 3D spaces. The discriminator tries to distinguish the predicted 3D poses from the ground-truth, which forces the pose estimator to generate valid 3D poses. To increase the accuracy of the generator network, multi-view frames are used. The estimated 3D poses are associated among the multi-view frames by a statistical method. The relative rotation and translation of the cameras with respect to each other are obtained along with the association. This step strengthens the generator network and removes ambiguities in the estimation of occluded or self-occluded body parts. The global 3D poses are then fed to the discriminator network in an attempt to fool it into treating them as ground-truth poses. Experimental results on multi-view and multi-person datasets (Campus, Shelf, Utrecht Multi-Person Motion (UMPM), and KTH Football 2) indicate that the proposed method achieves superior performance in comparison with other state-of-the-art methods, while it does not require any calibration information a priori.

Keywords 3D pose estimation · Multi-view · Human associations · Uncalibrated cameras · Generative adversarial

1 Introduction

For better readability of the paper, the abbreviation list is provided in Table 1.

Multimedia Tools and Applications

Table 1 Abbreviation list

Full form                               Acronym
Procrustes analysis                     PA
Generative adversarial network          GAN
Euclidean distance matrix               EDM
Angular difference matrix               ADM
Convolutional neural network            CNN
3D pictorial structure                  3DPS
Structured support vector machine       SSVM
Expectation-maximization                EM
Utrecht Multi-Person Motion             UMPM
Singular value decomposition            SVD
Percentage of correct estimated parts   PCP
Batch normalization                     BN
Rectified linear unit                   ReLU
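Two of the abbreviations in Table 1 can be made concrete with a short sketch. The code below is not the authors' implementation; it only illustrates, under stated assumptions, what an EDM of a single pose looks like and how an SVD-based rigid alignment (the rotation and translation part of a Procrustes analysis, without scaling) can recover the relative rotation and translation between two 3D poses of the same person. The function names `euclidean_distance_matrix` and `rigid_align`, the array layout (one joint per row), and the joint count used in the usage note are all hypothetical choices made for illustration.

```python
import numpy as np


def euclidean_distance_matrix(joints):
    """Pairwise Euclidean distances between joints.

    joints: array of shape (J, D), one joint per row (D = 2 or 3).
    Returns a symmetric (J, J) matrix with zeros on the diagonal.
    """
    diff = joints[:, None, :] - joints[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def rigid_align(source, target):
    """Least-squares rigid alignment of two 3D point sets via SVD (Kabsch).

    Finds a rotation R and translation t such that R @ source[i] + t is as
    close as possible to target[i]. Both inputs have shape (J, 3) and are
    assumed to list the same joints in the same order.
    """
    mu_s = source.mean(axis=0)
    mu_t = target.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (source - mu_s).T @ (target - mu_t)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_t - R @ mu_s
    return R, t
```

As a usage illustration, a synthetic pose with an assumed 14 joints can be rotated and translated, and the transform recovered with `R, t = rigid_align(pose, moved)`. The EDM of the pose is unchanged by such a rigid motion, which is one reason distance matrices are a convenient view-independent pose representation.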

3D human pose estimation suffers from the difficulty of gathering 3D ground-truth. While gathering large-scale 2D
