A Review of Dynamic Maps for 3D Human Motion Recognition Using ConvNets and Its Improvement



Zhimin Gao1 · Pichao Wang2 · Huogen Wang3 · Mingliang Xu1 · Wanqing Li4

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract
RGB-D based action recognition is attracting increasing attention in both the research and industrial communities. However, due to the lack of training data, pre-training based methods are popular in this field. This paper presents a review of the concept of dynamic maps for RGB-D based human motion recognition using models pretrained in the image domain. Dynamic maps recursively encode the spatial, temporal and structural information contained in a video sequence into dynamic motion images simultaneously. They enable the use of Convolutional Neural Networks (ConvNets) and their models pretrained on ImageNet for 3D human motion recognition. This simple, compact and effective representation achieves state-of-the-art results on various gesture/action/activity recognition datasets. Based on the review of previous methods that apply this concept to different modalities (depth, skeleton or RGB-D data), a novel encoding scheme is developed and presented in this paper. The improved method generates effective flow-guided dynamic maps, which can select the high-motion window and distinguish the order of frames with small motion. The improved flow-guided dynamic maps achieve state-of-the-art results on the large ChaLearn LAP IsoGD and NTU RGB+D datasets.

Keywords Dynamic maps · 3D human motion recognition · ConvNets
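The encoding idea that dynamic maps build on can be illustrated with a small, self-contained sketch. The snippet below is not the exact scheme proposed in this paper (the flow-guided variant is introduced later); it shows the underlying approximate rank pooling idea, in which a clip is collapsed into a single image by temporally ordered weights so that an ImageNet-pretrained ConvNet can consume it. The function name `approximate_rank_pooling` and the linear weights 2t − T − 1 are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def approximate_rank_pooling(frames: np.ndarray) -> np.ndarray:
    """Collapse a (T, H, W, C) clip into a single (H, W, C) dynamic image.

    Illustrative sketch only: the paper's flow-guided dynamic maps use a
    different, more elaborate encoding.
    """
    T = frames.shape[0]
    t = np.arange(1, T + 1, dtype=np.float64)
    # Linear temporal weights: later frames receive larger positive weights,
    # earlier frames negative ones, so frame order is baked into the sum.
    alpha = 2.0 * t - T - 1.0
    # Weighted sum over the temporal axis -> one image per clip.
    dyn = np.tensordot(alpha, frames.astype(np.float64), axes=(0, 0))
    # Rescale to 8-bit so the result can be fed to an ImageNet-pretrained ConvNet.
    dyn -= dyn.min()
    dyn /= max(dyn.max(), 1e-12)
    return (255.0 * dyn).astype(np.uint8)
```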

Huogen Wang [email protected] (corresponding author)
Zhimin Gao [email protected]
Pichao Wang [email protected]
Mingliang Xu [email protected]
Wanqing Li [email protected]

1 School of Information Engineering, Zhengzhou University, Zhengzhou, China
2 Alibaba Group (U.S.) Inc., Bellevue, WA, USA
3 School of Electrical and Information Engineering, Tianjin University, Tianjin, China
4 Advanced Multimedia Research Lab, University of Wollongong, Wollongong, Australia


1 Introduction

RGB-D (Red, Green, Blue and Depth) based human action recognition has attracted increasing attention due to the availability of RGB-D video cameras and the advantages they offer over conventional RGB video. For example, the additional depth information provides insensitivity to illumination changes and allows a more reliable estimation of body silhouettes and skeletons Shotton et al. [21]. However, it remains unclear how such video data could be compactly and effectively represented and used for computer vision tasks, including classification and recognition. A number of works Yang and Tian [41]; Xia et al. [39]; Wang et al. [32]; Vemulapalli et al. [26]; Wang et al. [31]; Yang et al. [43]; Oreifej and Liu [18]; Yang and Tian [42]; Lu et al. [17]; Liu et al. [16] have appeared in the literature following the lead of the earliest work, Li et al. [13], that used RGB-D data for human action recognition. It is interesting to note that the methods proposed in these works are based on