Layer-based sparse representation of multiview images


RESEARCH

Open Access

Andriy Gelman1*, Jesse Berent2 and Pier Luigi Dragotti1

Abstract

This article presents a novel method to obtain a sparse representation of multiview images. The method is based on the fact that multiview data is composed of epipolar-plane image lines which are highly redundant. We extend this principle to obtain the layer-based representation, which partitions a multiview image dataset into redundant regions (which we call layers), each related to a constant depth in the observed scene. The layers are extracted using a general segmentation framework which takes into account the camera setup and occlusion constraints. To obtain a sparse representation, the extracted layers are further decomposed using a multidimensional discrete wavelet transform (DWT), first across the view domain, followed by a two-dimensional (2D) DWT applied to the image dimensions. We modify the viewpoint DWT to take into account occlusions and scene depth variations. Simulation results based on nonlinear approximation show that the sparsity of our representation is superior to that of the multidimensional DWT without disparity compensation. In addition, we demonstrate that the constant depth model of the representation can be used to synthesise novel viewpoints for immersive viewing applications and also to de-noise multiview images.

1 Introduction

The notion of sparsity, namely the idea that the essential information contained in a signal can be represented with a small number of significant components, is widespread in signal processing and data analysis in general. Sparse signal representations are at the heart of many successful signal processing applications, such as signal compression and de-noising. In the case of images, successful new representations have been developed on the assumption that the data is well modelled by smooth regions separated by edges or regular contours.
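To make the notion of sparsity and of nonlinear approximation concrete, the following is a minimal sketch in Python, not the method of this article. It assumes a one-dimensional orthonormal Haar wavelet in place of the multidimensional, disparity-compensated DWT described above, and the function names (haar_dwt, haar_idwt, nonlinear_approximation) and the test signal are hypothetical choices made for illustration only. The signal follows the "smooth regions separated by edges" model in its simplest form: two constant regions and one edge.

```python
import numpy as np


def haar_dwt(signal):
    """Full orthonormal Haar decomposition of a signal whose length is a power of two."""
    approx = np.asarray(signal, dtype=float)
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # detail (high-pass) band
        approx = (even + odd) / np.sqrt(2.0)         # approximation (low-pass) band
    # Layout: [coarsest approximation, coarsest detail, ..., finest detail]
    return np.concatenate([approx] + details[::-1])


def haar_idwt(coeffs):
    """Inverse of haar_dwt, using the same coefficient layout."""
    coeffs = np.asarray(coeffs, dtype=float)
    approx = coeffs[:1]
    pos = 1
    while pos < coeffs.size:
        detail = coeffs[pos:pos + approx.size]
        pos += approx.size
        merged = np.empty(2 * approx.size)
        merged[0::2] = (approx + detail) / np.sqrt(2.0)
        merged[1::2] = (approx - detail) / np.sqrt(2.0)
        approx = merged
    return approx


def nonlinear_approximation(signal, n_terms):
    """Keep the n_terms (>= 1) largest-magnitude coefficients and zero the rest."""
    coeffs = haar_dwt(signal)
    keep = np.argsort(np.abs(coeffs))[-n_terms:]
    sparse = np.zeros_like(coeffs)
    sparse[keep] = coeffs[keep]
    return haar_idwt(sparse)


# Hypothetical test signal: two constant regions separated by a single edge.
signal = np.concatenate([np.full(32, 1.0), np.full(32, 5.0)])
approx = nonlinear_approximation(signal, n_terms=2)
error = float(np.sum((signal - approx) ** 2))  # squared approximation error
```

Because the edge in this signal is aligned with the Haar dyadic grid, only two of the 64 coefficients are non-zero, so a two-term approximation reproduces the signal up to rounding error. Signals with misaligned edges or textured regions need more terms, which is the kind of comparison a nonlinear approximation experiment measures.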
Besides wavelets, which have been successful for image compression [1], other examples of dictionaries that provide sparse image representations are curvelets [2], contourlets [3], ridgelets [4], directionlets [5], bandlets [6,7] and complex wavelets [8,9]. We refer the reader to a recent overview article [10] for a more comprehensive review of the theory of sparse signal representation.

* Correspondence: [email protected]
1 Communications and Signal Processing Group, Imperial College London, London SW7 2AZ, UK
Full list of author information is available at the end of the article

In parallel and somewhat independently of these developments, there has been a growing interest in the capture and processing of multiview images. The popularity of this approach has been driven by the advent of novel exciting applications such as immersive communication [11] or free-viewpoint and three-dimensional (3D) TV [12]. At the heart of these applications is the idea that a novel arbitrary photorealistic view of a real scene can be obtained by proper interpolation of existing views. The problem of synthes