Motion Segmentation and Retrieval for 3D Video Based on Modified Shape Distribution



Research Article

Toshihiko Yamasaki and Kiyoharu Aizawa

Department of Information and Communication Engineering, Graduate School of Information Science and Technology, The University of Tokyo, Engineering Building No. 2, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan

Received 31 January 2006; Accepted 14 October 2006

Recommended by Tsuhan Chen

A similar-motion search and retrieval system for 3D video is presented, based on a modified shape distribution algorithm. 3D video is a sequence of 3D models made for a real-world object. In the present work, three fundamental functions for efficient retrieval have been developed: feature extraction, motion segmentation, and similarity evaluation. Stable shape feature representation of 3D models has been realized by a modified shape distribution algorithm. Motion segmentation has been conducted by analyzing the degree of motion using the extracted feature vectors. Then, similar motion retrieval has been achieved by employing the dynamic programming algorithm in the feature vector space. Experiments using 3D video sequences of dances have demonstrated very promising results for motion segmentation and retrieval.

Copyright © 2007 T. Yamasaki and K. Aizawa. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
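The three functions named in the abstract (feature extraction, motion segmentation, and similarity evaluation) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the original D2 shape distribution (a histogram of distances between randomly sampled vertex pairs) as a stand-in for the modified algorithm, takes the Euclidean distance between consecutive feature vectors as the degree of motion, treats local minima of that signal as candidate segment boundaries, and uses a plain dynamic-time-warping recurrence for the dynamic programming step. All function names and parameters (`d2_histogram`, `n_pairs`, `n_bins`, `max_dist`, and so on) are hypothetical.

```python
import math
import random


def d2_histogram(vertices, n_pairs=2000, n_bins=16, max_dist=2.0, seed=0):
    """Assumed stand-in feature: normalized histogram of distances between
    randomly sampled vertex pairs (the original D2 shape distribution)."""
    rng = random.Random(seed)
    hist = [0.0] * n_bins
    for _ in range(n_pairs):
        p, q = rng.choice(vertices), rng.choice(vertices)
        d = math.dist(p, q)
        # Distances beyond max_dist are clamped into the last bin.
        b = min(int(d / max_dist * n_bins), n_bins - 1)
        hist[b] += 1.0
    return [h / n_pairs for h in hist]


def degree_of_motion(features):
    """Assumed motion measure: Euclidean distance between the feature
    vectors of consecutive frames."""
    return [math.dist(a, b) for a, b in zip(features, features[1:])]


def local_minima(values):
    """Indices of interior local minima, used here as candidate
    segment boundaries (an assumption, not the paper's exact rule)."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i] < values[i - 1] and values[i] <= values[i + 1]
    ]


def dp_distance(seq_a, seq_b):
    """Dynamic-programming (DTW-style) alignment cost between two
    sequences of feature vectors; lower means more similar motion."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

In this sketch, each frame's vertex list would be passed through `d2_histogram`, the resulting per-frame vectors through `degree_of_motion` and `local_minima` to cut the sequence into clips, and pairs of clips through `dp_distance` to rank candidates against a query.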

1. INTRODUCTION

Dynamic three-dimensional (3D) modeling of real-world objects using multiple cameras has been an active research area in recent years [1–5]. Because such sequential 3D models, which we call 3D video, are generated using many cameras and represented as 3D polygon meshes, they provide a realistic representation of dynamic 3D objects. Namely, the objects' appearance, such as shape and color, and its temporal change are captured in 3D video. 3D video is therefore different from conventional 3D computer graphics and 3D motion capture data. Similar to 2D video, 3D video consists of consecutive sequences of 3D models (frames). Each frame contains three kinds of data: vertex coordinates, vertex connectivity, and color.

So far, research on 3D video is still in its infancy and has focused mainly on capture systems [1–5] and compression [6, 7]. As the amount of 3D video data increases, efficient and effective segmentation and retrieval systems are needed for managing the database. Related work can be found for so-called 3D "motion capture" data, aiming at motion segmentation [8–12] and retrieval [13–15]. This is because structural features such as motion of joints and other feature points are easily located and tracked in motion capture data. For motion segmentation, Shiratori et al. analyzed local minima in motion [8]. The idea of searching for local minima in kinematic parameters was also employed in [9]. S