GAN-Poser: an improvised bidirectional GAN model for human motion prediction

S.I.: DEEP LEARNING APPROACHES FOR REALTIME IMAGE SUPER RESOLUTION (DLRSR)

Deepak Kumar Jain¹ • Masoumeh Zareapoor² • Rachna Jain³ • Abhishek Kathuria³ • Shivam Bachhety³

Received: 16 June 2019 / Accepted: 8 April 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract
A novel method called GAN-Poser is explored to predict human motion in less time, given an input 3D human skeleton sequence, using a generator–discriminator framework. Specifically, rather than the conventional Euclidean loss, a frame-wise geodesic loss is used for a geometrically meaningful and more precise distance measurement. In this paper, we use a bidirectional GAN framework together with a recursive prediction strategy to avoid mode collapse and to further regularize the training. To generate multiple probable human-pose sequences conditioned on a given starting sequence, a random extrinsic factor H is also introduced. The discriminator is trained to regress the extrinsic factor H, which is used along with the intrinsic factor (the encoded starting pose sequence) to generate a particular pose sequence. Despite the probabilistic framework, the modified discriminator architecture allows predictions of an intermediate part of the pose sequence to be used as conditioning for predicting the latter part of the sequence. This adversarial learning-based model takes stochasticity into account, and the bidirectional setup provides a new way to evaluate prediction quality against a given test sequence. The resulting method, GAN-Poser, achieves superior performance over state-of-the-art deep learning approaches when evaluated on the standard NTU-RGB-D and Human3.6M datasets.

Keywords Human motion · GAN · Probability theory · Pose estimation · Sequence model · 3D model
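The difference between a Euclidean loss and a frame-wise geodesic loss can be illustrated with a minimal sketch. It is not the formulation used in this paper, which is not given in this section. It assumes that each joint orientation is represented as a 3x3 rotation matrix and uses the standard geodesic distance on SO(3), i.e. the angle of the relative rotation, arccos((tr(RₐᵀR_b) − 1) / 2). The function names `rot_z`, `geodesic_distance` and `framewise_geodesic_loss` are hypothetical and chosen for illustration only.

```python
import math


def rot_z(theta):
    """Rotation matrix for an angle theta (radians) about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def geodesic_distance(r_a, r_b):
    """Angle (radians) of the relative rotation r_a^T r_b on SO(3)."""
    # trace(r_a^T r_b) equals the sum of element-wise products of r_a and r_b.
    trace = sum(r_a[i][j] * r_b[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_angle)


def framewise_geodesic_loss(pred, target):
    """Mean geodesic distance over all frames and joints (an assumed reduction).

    pred and target are lists of frames; each frame is a list of
    3x3 joint rotation matrices.
    """
    total, count = 0.0, 0
    for frame_p, frame_t in zip(pred, target):
        for r_p, r_t in zip(frame_p, frame_t):
            total += geodesic_distance(r_p, r_t)
            count += 1
    return total / count


# Two frames with one joint each; the second frame is off by 0.4 rad.
pred = [[rot_z(0.0)], [rot_z(0.2)]]
target = [[rot_z(0.0)], [rot_z(0.6)]]
loss = framewise_geodesic_loss(pred, target)
```

Unlike an element-wise Euclidean difference between matrices or angle vectors, this distance measures the actual rotation angle separating two orientations, which is why it is described as geometrically meaningful. Averaging over frames and joints is one possible reduction; a sum would serve equally well for illustration.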

Corresponding author: Deepak Kumar Jain, [email protected]

¹ Key Laboratory of Intelligent Air-Ground Cooperative Control for Universities in Chongqing, College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, China

² School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China

³ Department of Computer Science and Engineering, Bharati Vidyapeeth’s College of Engineering, New Delhi, India

1 Introduction
Making accurate short-term (several seconds) predictions of what is going to happen in the world, given past events, is a fundamental and useful human ability. Such an ability is important for daily activities, social interactions and ultimately survival. For example, driving requires predicting the motions of other cars and pedestrians so as to avoid an accident; greeting requires predicting the position of the other person’s hand; and taking part in sports requires predicting other players’ reactions. For a model to interact seamlessly with the world, it needs the same ability to grasp the