Optimizing Training Set Construction for Video Semantic Classification
Research Article

Jinhui Tang,1 Xian-Sheng Hua,2 Yan Song,1 Tao Mei,2 and Xiuqing Wu1

1 Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China
2 Microsoft Research Asia, Beijing 100080, China
Correspondence should be addressed to Jinhui Tang, [email protected]

Received 9 March 2007; Revised 14 September 2007; Accepted 12 November 2007

Recommended by Mark Kahrs

We investigate criteria for optimizing training set construction for large-scale video semantic classification. Due to the large gap between low-level features and high-level semantics, as well as the high diversity of video data, it is difficult to represent the prototypes of semantic concepts with a training set of limited size. In video semantic classification, most learning-based approaches require a large training set to achieve good generalization, which makes a large amount of labor-intensive manual labeling unavoidable. However, it is observed that the generalization capacity of a classifier depends more on the geometrical distribution of the training data than on its size. We argue that a training set that captures most of the temporal and spatial distribution information of the whole dataset can achieve good performance even when its size is limited. To capture the geometrical distribution characteristics of a given video collection, we propose four metrics for constructing/selecting an optimal training set: salience, temporal dispersiveness, spatial dispersiveness, and diversity. Based on these metrics, we further propose a set of optimization rules to capture the most distribution information of the whole dataset using a training set of a given size. Experimental results demonstrate that these rules are effective for training set construction in video semantic classification and significantly outperform random training set selection.

Copyright © 2008 Jinhui Tang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
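The paper's exact definitions of the four metrics appear later in the text; as intuition, spatial dispersiveness rewards samples that are spread out in feature space, while temporal dispersiveness rewards samples drawn from across the timeline of the collection. The following is a minimal sketch of a greedy selection strategy in that spirit. It is not the authors' algorithm: the farthest-point heuristic, the mixing weight alpha, and the function name select_training_set are all illustrative assumptions.

```python
import numpy as np

def select_training_set(features, timestamps, k, alpha=0.5):
    """Greedy farthest-point selection that mixes feature-space spread
    (a spatial-dispersiveness proxy) with temporal spread (a
    temporal-dispersiveness proxy). Hypothetical sketch only."""
    # Normalize timestamps so the two distance terms are comparable.
    t = (timestamps - timestamps.min()) / (np.ptp(timestamps) + 1e-12)
    selected = [0]  # seed with an arbitrary first sample
    # Distance from every candidate to its nearest selected sample.
    feat_d = np.linalg.norm(features - features[0], axis=1)
    time_d = np.abs(t - t[0])
    for _ in range(k - 1):
        # Score each candidate by its distance to the closest
        # already-selected sample, then pick the farthest one.
        score = alpha * feat_d + (1.0 - alpha) * time_d
        nxt = int(np.argmax(score))
        selected.append(nxt)
        # Update nearest-selected distances with the new pick.
        feat_d = np.minimum(feat_d,
                            np.linalg.norm(features - features[nxt], axis=1))
        time_d = np.minimum(time_d, np.abs(t - t[nxt]))
    return selected
```

Because each greedy step picks the candidate farthest from the already-selected set, the chosen subset spreads over both feature space and time rather than clustering in dense regions, which is the behavior the dispersiveness metrics are designed to reward.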
1. INTRODUCTION
Video content analysis is an elementary step in mining the semantic information in video collections. Within it, semantic classification (also referred to as annotation) of video segments is essential for further analysis, as well as important for enabling semantic-level video search. For human beings, most semantic concepts are clear and easy to identify. However, due to the large gap between semantics and low-level features, the corresponding features are generally not well separated in feature space and are thus difficult for computers to identify. This remains an open problem in the computer vision and visual content analysis communities. Generally, learning-based video semantic classification methods use statistical learning algorithms to model the semantic concepts (generative learning
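The sentence above is cut off in the source; it appears to be introducing the standard split between generative approaches, which model each concept's feature distribution, and discriminative approaches, which learn the decision boundary directly. Below is a minimal sketch of that distinction, assuming scikit-learn's GaussianMixture and SVC as stand-ins and synthetic data in place of real video features; it is not the paper's setup.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 16))            # stand-in low-level shot features
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # toy binary concept labels

# Generative: fit one class-conditional density per concept, then
# classify a sample by the model under which it is most likely.
gmms = [GaussianMixture(n_components=2, random_state=0).fit(X[y == c])
        for c in (0, 1)]
gen_pred = np.argmax([g.score_samples(X) for g in gmms], axis=0)

# Discriminative: skip density estimation and learn the decision
# boundary between the concepts directly.
svm = SVC(kernel="rbf").fit(X, y)
disc_pred = svm.predict(X)
```

In the generative route each concept gets its own density model; the discriminative route optimizes the boundary directly, which is why its generalization is so sensitive to how the labeled training samples are distributed.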