Sliding Adjustment for 3D Video Representation

Franck Galpin
IRISA/INRIA Rennes, Université de Rennes 1, Campus de Beaulieu, 35042 Rennes Cédex, France
Email: [email protected]

Luce Morin
IRISA/INRIA Rennes, Université de Rennes 1, Campus de Beaulieu, 35042 Rennes Cédex, France
Email: [email protected]

Received 31 August 2001 and in revised form 15 February 2002

This paper deals with video coding of static scenes viewed by a moving camera. We propose an automatic way to encode such video sequences using several 3D models. Contrary to prior art in model-based coding, where the 3D models have to be known in advance, the 3D models are automatically computed from the original video sequence. We show that several independent 3D models provide the same functionalities as one single 3D model, and avoid some drawbacks of the previous approaches. To achieve this goal, we propose a novel sliding adjustment algorithm, which ensures consistency of successive 3D models. The paper presents a method to automatically extract the set of 3D models and the associated camera positions. The obtained representation can be used for reconstructing the original sequence, or virtual ones. It also enables 3D functionalities such as synthetic object insertion, lighting modification, or stereoscopic visualization. Results on real video sequences are presented.

Keywords and phrases: sliding adjustment, 3D model reconstruction, video coding, model-based coding, video manipulation.

1. INTRODUCTION

More and more new coding techniques include high-level information in the video sequence representation. This information aims to provide high-level functionalities such as interactivity, video content description, video manipulation, or stereo visualization. For instance, the QuickTime-VR format provides interactive visualization of a real static scene by representing it as a panoramic image [1]. The MPEG-4 standard describes the video scene content as a set of planar objects called video object planes (VOPs) [2], which can be interactively moved or combined during visualization. A panoramic representation of static backgrounds is also proposed in MPEG-4 with the Sprite format.

Such 2D representations do not give information on the 3D structure of the scene, and are therefore limited for video manipulation. Panoramic images provide only limited interactivity: zoom and view orientation can be changed, but the viewpoint is fixed. With 2D representations, video manipulations such as hybrid synthetic-real video mixing, which involve occlusions, shadows, or lighting modification, are not feasible in a realistic way. These functionalities require 3D information on the scene.

3D model-based representations for real video sequences have been studied for a long time, since they have very attractive properties. Apart from the functionalities that they provide, they enable very low bit rates and scalable/progressive coding [3]. 3D model-based representations can be classified into explicit and implicit representations. Within the explicit representations, we can disti