A New Dataset and Evaluation for Infrared Action Recognition
1 Chongqing Key Laboratory of Signal and Information Processing, Chongqing University of Posts and Telecommunications, Chongqing, China [email protected]
2 Institute for Information and System Sciences and Ministry of Education Key Lab of Intelligent Networks and Network Security, Xi’an Jiaotong University, Xi’an, China
Abstract. Action recognition (AR) is one of the most important tasks in computer vision, and a large body of research addresses it. While most of this work is evaluated on AR datasets collected in the visible spectrum, the AR problem in infrared scenarios has not yet attracted much attention, and few public infrared datasets are available to support such research. This study aims to emphasize the importance of the infrared AR problem in real applications and to draw researchers’ attention to this task. Specifically, we construct a new infrared action dataset and evaluate the state-of-the-art AR pipeline on it, including widely used low-level local descriptors, coding methods, and fusion strategies. These evaluations yield some interesting results: the dense trajectory feature achieves the best performance, while appearance features such as HOG perform relatively poorly; the vector of locally aggregated descriptors (VLAD) coding method is evidently better than the widely used Fisher vector; and late fusion performs better than early fusion. Furthermore, the best performance achieved on our dataset is 70%, leaving relatively large room for new methods on this infrared AR task.

Keywords: Infrared action dataset · Action recognition · Local descriptors · Feature fusion
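For readers unfamiliar with the coding and fusion terms used in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes hard assignment to precomputed cluster centers, global L2 normalization for the vector of locally aggregated descriptors encoding, and simple weighted score averaging for late fusion. The function names `vlad_encode`, `early_fusion`, and `late_fusion` are hypothetical and chosen only for illustration.

```python
import math


def vlad_encode(descriptors, centers):
    """Encode local descriptors as one vector of aggregated residuals.

    Assumption: `centers` are cluster centers learned beforehand
    (for example by k-means); each descriptor is hard-assigned to
    its nearest center.
    """
    k, d = len(centers), len(centers[0])
    residuals = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Nearest center by squared Euclidean distance.
        nearest = min(
            range(k),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centers[j])),
        )
        # Accumulate the residual (descriptor minus its center).
        for i in range(d):
            residuals[nearest][i] += x[i] - centers[nearest][i]
    flat = [v for row in residuals for v in row]
    # Global L2 normalization (an assumed, common post-processing step).
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


def early_fusion(encodings):
    """Early fusion: concatenate encodings, then train one classifier."""
    return [v for enc in encodings for v in enc]


def late_fusion(score_lists, weights=None):
    """Late fusion: combine per-class scores from separate classifiers.

    Assumption: a weighted average with uniform weights by default.
    """
    n = len(score_lists)
    weights = weights or [1.0 / n] * n
    num_classes = len(score_lists[0])
    return [
        sum(w * scores[c] for w, scores in zip(weights, score_lists))
        for c in range(num_classes)
    ]
```

In this sketch, early fusion merges the descriptor encodings before any classifier is trained, whereas late fusion trains one classifier per descriptor and merges only their per-class scores.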
1 Introduction
Action recognition (AR) is one of the most important tasks in computer vision. Its potential applications include video surveillance, video indexing, human-computer interaction (HCI), etc. [1]. Over the past decades, human action recognition has attracted extensive attention, and a number of methods have been proposed to address this task [24]. Most of these efforts have focused on videos captured in the visible spectrum, and many existing methods follow the same pipeline: raw feature extraction, feature coding, and classifier learning. Generally speaking, the descriptive ability of the adopted features is very important to the performance of the method. So far, many good feature descriptors have been widely used for action recognition, such as STIP [18], HOG3D [14], 3DSIFT [23], etc. Feature descriptors need to be refined and validated on proper AR datasets. Recently, many AR datasets have been constructed for research purposes, such as KTH [22], UCF sports [26], HMDB51 [16], WEBinteraction [8], etc. The recently proposed AR datasets [15] increasingly simulate real scenarios. While benefited from

© Springer-Verlag Berlin Heidelberg 2015. H. Zha et al. (Eds.): CCCV 2015, Part II, CCIS 547, pp. 302–312, 2015. DOI: 10.1007/978-3-662-48570-5_30