Online Human Action Detection Using Joint Classification-Regression Recurrent Neural Networks

Human action recognition from well-segmented 3D skeleton data has been intensively studied and has been attracting an increasing attention. Online action detection goes one step further and is more challenging, which identifies the action type and localiz

PDF / 1,156,094 Bytes
18 Pages / 439.37 x 666.142 pts Page_size
4 Downloads / 197 Views

DOWNLOAD

REPORT

Institute of Computer Science and Technology, Peking University, Beijing, China {lyttonhao,liujiaying}@pku.edu.cn 2 Microsoft Research Asia, Beijing, China {culan,wezeng}@microsoft.com 3 Institute of Automation, Chinese Academy of Sciences, Beijing, China {jlxing,cfyuan}@nlpr.ia.ac.cn

Abstract. Human action recognition from well-segmented 3D skeleton data has been intensively studied and has been attracting an increasing attention. Online action detection goes one step further and is more challenging, which identiﬁes the action type and localizes the action positions on the ﬂy from the untrimmed stream data. In this paper, we study the problem of online action detection from streaming skeleton data. We propose a multi-task end-to-end Joint Classiﬁcation-Regression Recurrent Neural Network to better explore the action type and temporal localization information. By employing a joint classiﬁcation and regression optimization objective, this network is capable of automatically localizing the start and end points of actions more accurately. Speciﬁcally, by leveraging the merits of the deep Long Short-Term Memory (LSTM) subnetwork, the proposed model automatically captures the complex long-range temporal dynamics, which naturally avoids the typical sliding window design and thus ensures high computational eﬃciency. Furthermore, the subtask of regression optimization provides the ability to forecast the action prior to its occurrence. To evaluate our proposed model, we build a large streaming video dataset with annotations. Experimental results on our dataset and the public G3D dataset both demonstrate very promising performance of our scheme.

Keywords: Action detection classiﬁcation-regression

·

Recurrent neural network

·

Joint

This work was done at Microsoft Research Asia. Electronic supplementary material The online version of this chapter (doi:10. 1007/978-3-319-46478-7 13) contains supplementary material, which is available to authorized users. c Springer International Publishing AG 2016 B. Leibe et al. (Eds.): ECCV 2016, Part VII, LNCS 9911, pp. 203–220, 2016. DOI: 10.1007/978-3-319-46478-7 13

204

1

Y. Li et al.

Introduction

Human action detection is an important problem in computer vision, which has broad practical applications like visual surveillance, human-computer interaction and intelligent robot navigation. Unlike action recognition and oﬄine action detection, which determine the action after it is fully observed, online action detection aims to detect the action on the ﬂy, as early as possible. It is much desirable to accurately and timely localize the start point and end point of an action along the time and determine the action type as illustrated in Fig. 1. Besides, it is also desirable to forecast the start and end of the actions prior to their occurrence. For example, for intelligent robot system, in addition to the accurate detection of actions, it would also be appreciated if it can predict the start of the impending action or the end of the ongoing actions and then get something ready

Data Loading...

Online Human Action Detection Using Joint Classification-Regression Recurrent Neural Networks

Recommend Documents

Verifying Recurrent Neural Networks Using Invariant Inference

Locally Recurrent Neural Networks

Online Action Detection

Recurrent Neural Networks

Recurrent Neural Networks

Recurrent Neural Networks

Human Action Detection Using Deep Learning

Human Action Detection Using Deep Learning Techniques

Sparse Bayesian Recurrent Neural Networks

Water Leakage Detection Using Neural Networks

Semantic Recommendations of Books Using Recurrent Neural Networks

Static Music Emotion Recognition Using Recurrent Neural Networks