Retrieval by Local Motion
Berna Erol
Ricoh California Research Center, 2882 Sand Hill Road, Suite 115, Menlo Park, CA 94025-7022, USA
Email: berna [email protected]

Faouzi Kossentini
Department of Electrical and Computer Engineering, University of British Columbia, 2356 Main Mall, Vancouver, British Columbia, Canada V6T 1Z4
Email: [email protected]

Received 15 May 2002 and in revised form 30 September 2002

Motion features play an important role in video retrieval. The current literature addresses motion retrieval mostly in terms of camera motion and the global motion of individual video objects in a video scene. In this paper, we propose two new motion descriptors that capture the local motion of a video object within its bounding box. The proposed descriptors are rotation and scale invariant and are based on the angular and circular area variances of the video object and the variances of the angular radial transform coefficients. Experiments show that the rankings obtained by querying with our proposed descriptors closely match human rankings.

Keywords and phrases: video databases, video indexing and retrieval, object-based video, motion descriptor, MPEG-4, MPEG-7.
1. INTRODUCTION
As advancements in digital video compression have made large video databases widely available, indexing and retrieval of video has become a very active research area. Unlike still images, video has a temporal dimension with which we can associate motion features. We use this information as one of the key components for describing video sequences; for example, "this is the part where we were salsa dancing" or "this video shows my daughter skating for the first time." Consequently, motion features play an important role in content-based video retrieval. Video motion features can be classified into three groups:

(i) Global motion of the video, or camera motion (e.g., camera zoom, pan, tilt, roll).
(ii) Global motion of the video objects within a frame (e.g., an object is moving from the left to the right of the scene).
(iii) Local motion of the video object (e.g., a person is raising his/her arms).

Camera operation analysis is generally performed by analyzing the directions of the motion vectors present in the compressed video bitstream [1, 2, 3] or by optical flow analysis in the spatial domain [4]. For example, panning and tilting motions are likely to be present if most of the motion vectors in a frame point in the same direction. Similarly, zooming motion can be identified by determining whether or not
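The motion-vector heuristics just described can be sketched in a few lines of Python. This is only an illustration of the idea, not the descriptors proposed in this paper; the function names and the threshold values (`agree_frac`, `tol_deg`) are assumptions chosen for the example.

```python
import math

def classify_pan(vectors, agree_frac=0.8, tol_deg=30.0):
    """Pan/tilt heuristic: most motion vectors share one direction.

    vectors: list of (dx, dy) motion vectors, one per macroblock.
    Returns the mean direction in degrees if the fraction of vectors
    agreeing with it within tol_deg exceeds agree_frac, else None.
    """
    moving = [(dx, dy) for dx, dy in vectors if dx or dy]
    if not moving:
        return None
    # Circular mean of the vector directions.
    s = sum(math.sin(math.atan2(dy, dx)) for dx, dy in moving)
    c = sum(math.cos(math.atan2(dy, dx)) for dx, dy in moving)
    mean = math.degrees(math.atan2(s, c))

    def ang_diff(a, b):  # smallest absolute angular difference
        return abs((a - b + 180.0) % 360.0 - 180.0)

    agree = sum(1 for dx, dy in moving
                if ang_diff(math.degrees(math.atan2(dy, dx)), mean) <= tol_deg)
    return mean if agree / len(moving) >= agree_frac else None

def classify_zoom(vectors_with_pos, frame_w):
    """Zoom heuristic: horizontal motion components on the left half of
    the frame oppose those on the right half, i.e., the vectors diverge
    from or converge toward the frame center.

    vectors_with_pos: list of (x, dx) pairs, macroblock x-position and
    horizontal motion component.
    """
    left = [dx for x, dx in vectors_with_pos if x < frame_w / 2 and dx]
    right = [dx for x, dx in vectors_with_pos if x >= frame_w / 2 and dx]
    if not left or not right:
        return False
    mean_l = sum(left) / len(left)
    mean_r = sum(right) / len(right)
    return mean_l * mean_r < 0  # opposite signs on the two halves
```

A uniform motion-vector field classifies as a pan, while a field whose left and right halves move in opposite horizontal directions classifies as a zoom.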
the motion vectors at the top/left of the frame point in the opposite direction to those at the bottom/right of the frame [5, 6]. Global motion of video objects is represented by their motion trajectories, which are formed by tracking the location of video objects (the object's mass center or some selected points on the object) over a sequence of frames. Forming motion trajectories generally requires segmentation of the video objects in a video scene. In MPEG-4, the location information of the video object bounding box
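A minimal sketch of the trajectory-forming step described above, tracking an object's mass center across frames. The function names and the binary-mask representation of segmented objects are assumptions made for illustration.

```python
def mass_center(mask):
    """Centroid of a segmented video object, given as a binary mask
    (a list of rows of 0/1 values)."""
    pts = [(x, y) for y, row in enumerate(mask)
                  for x, v in enumerate(row) if v]
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

def trajectory(masks):
    """Motion trajectory of one object: the sequence of its per-frame
    mass centers, one per segmented frame."""
    return [mass_center(m) for m in masks]
```

In practice, the per-frame masks come from a video object segmentation stage, which is why trajectory formation generally presupposes segmentation.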