DAPs: Deep Action Proposals for Action Understanding



1 King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia {victor.escorcia,fabian.caba,bernard.ghanem}@kaust.edu.sa
2 Stanford University, Stanford, USA [email protected]
3 Universidad del Norte, Barranquilla, Colombia

Abstract. Object proposals have contributed significantly to recent advances in object understanding in images. Inspired by the success of this approach, we introduce Deep Action Proposals (DAPs), an effective and efficient algorithm for generating temporal action proposals from long videos. We show how to take advantage of the vast capacity of deep learning models and memory cells to retrieve, from untrimmed videos, temporal segments that are likely to contain actions. A comprehensive evaluation indicates that our approach outperforms previous work on a large-scale action benchmark, runs at 134 FPS, making it practical for large-scale scenarios, and exhibits an appealing ability to generalize, i.e., to retrieve good-quality temporal proposals of actions unseen in training.

Keywords: Action proposals · Action detection · Long-short term memory

1 Introduction

Nowadays, the ubiquity of digital cameras and social networks has increased the amount of visual media content (especially videos) generated and shared by people. In the face of this data deluge, it becomes crucial to develop efficient and scalable algorithms that can intelligently parse/browse visual data to discover semantic information. In this paper, we focus on the task of quickly localizing temporal chunks in untrimmed videos that are likely to contain human activities of interest. This is the well-known task of temporal action proposal generation. The detected temporal proposals can facilitate and speed up activity detection, indexing, and retrieval in long videos. For example, a “good” action proposal method can retrieve video snippets of a home-run being scored within a large corpus of baseball games or extract important moments during the construction of a new skyscraper. Motivated by the large-scale nature of the problem, we develop a temporal proposal algorithm that retrieves high-fidelity proposals with a much smaller computational cost than previous methods (refer to Fig. 1). The idea of extracting regions with semantic content is not new in the computer vision community. Object pro

Fig. 1. An effective and efficient action proposal algorithm can localize segments of varied duration around actions occurring along a video without exhaustively exploring multiple temporal scales. This work shows how to produce high-quality temporal proposals that are likely to contain actions, 10x faster than the state-of-the-art approach.

Electronic supplementary material: The online version of this chapter (doi:10.1007/978-3-319-46487-9_47) contains supplementary material, which is available to authorized users.
© Springer International Publishing AG 2016. B. Leibe et al. (Eds.): ECCV 2016, Part III, LNCS 9907, pp. 768–784, 2016. DOI: 10.1007/978-3-319-46487-9_47
