Skeleton-Based Human Action Recognition with Profile Hidden Markov Models



School of Computer Science and Technology, Xidian University, Xi’an, China
School of Mathematical Sciences, Huaibei Normal University, Anhui, China
{dww2048,chengfei8582}@163.com, [email protected], [email protected], [email protected]

Abstract. Recognizing human actions from image sequences is an active area of research in computer vision. In this paper, a novel HMM-based approach is proposed for human action recognition using the 3D positions of body joints. First, actions are segmented into meaningful action units, called dynamic instants and intervals, using motion velocities, the direction of motion, and the curvatures of 3D trajectories. Then each action unit, together with its spatio-temporal feature set, is clustered by unsupervised learning, such as a self-organizing map (SOM), to generate a sequence of discrete symbols. To cope with abrupt changes or abnormal gesticulation between different performances of the same action, Profile Hidden Markov Models (Profile HMMs) are applied to these symbol sequences, using the Viterbi and Baum-Welch algorithms, for human activity recognition. Experimental evaluations show that the proposed approach achieves promising results compared with other state-of-the-art algorithms.

Keywords: View-invariant representation · Skeleton joints · Human activity recognition · Profile HMM · Self-organizing map

1 Introduction

Recognizing human activity is a key component in many applications, such as video surveillance, ambient intelligence, human-computer interaction systems, and even health care. Despite remarkable research efforts and many encouraging advances in the past decade, accurate recognition of human actions is still a challenging task. Many recent state-of-the-art techniques for human action recognition rely on Bag-of-Words (BoW) [1] representations extracted from Spatio-Temporal Interest Points (STIP) [2], the Dynamic Time Warping (DTW) [3] algorithm derived from exemplar-based approaches, Eigenjoints [4] from skeleton-based approaches, etc. Although state-of-the-art activity recognition approaches achieve good results, they still have some limitations. To address these issues and enhance human action recognition performance, a time-sequential representation is more appropriate for this problem.

© Springer-Verlag Berlin Heidelberg 2015. H. Zha et al. (Eds.): CCCV 2015, Part I, CCIS 546, pp. 12–21, 2015. DOI: 10.1007/978-3-662-48558-3_2

[Figure: framework diagram. Labels recovered from the diagram: Training; Testing; Test Sequence (example symbol string: DgEvEpEtEwD); Action Units (Postures and Actionlets) Extraction; Segmentation Points; Dynamic Instants (Postures); Intervals (Actionlets); Labeling for Action Units; SOM clustering; Davies-Bouldin Index; Action Units Sequence Repository; Profile Hidden Markov Models (Profile HMMs); Aligned Sequences; Action Classification.]

Fig. 1. The general framework of the proposed approach.

Frame-by-frame representations suffer from redundancy. Therefore, segmenting video into states and handling unaligned video sequences are two main problems. In