Automatic Video Object Segmentation Using Volume Growing and Hierarchical Clustering

Fatih Porikli
Mitsubishi Electric Research Laboratories, Cambridge, MA 02139, USA
Email: [email protected]

Yao Wang
Department of Electrical Engineering, Polytechnic University, Brooklyn, NY 11201, USA
Email: [email protected]

Received 4 February 2003; Revised 26 December 2003

We introduce an automatic segmentation framework that blends the advantages of color-, texture-, shape-, and motion-based segmentation methods in a computationally feasible way. A spatiotemporal data structure is first constructed for each group of video frames, in which each pixel is assigned a feature vector based on low-level visual information. Then, the smallest homogeneous components, so-called volumes, are expanded from selected marker points using an adaptive, three-dimensional, centroid-linkage method. Self descriptors that characterize each volume and relational descriptors that capture the mutual properties between pairs of volumes are determined by evaluating the boundary, trajectory, and motion of the volumes. These descriptors are used to measure the similarity between volumes, based on which volumes are further grouped into objects. A fine-to-coarse clustering algorithm yields a multiresolution object tree representation as an output of the segmentation.

Keywords and phrases: video segmentation, object detection, centroid linkage, color similarity.
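The centroid-linkage volume growing mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a scalar intensity per pixel instead of a feature vector, a fixed threshold instead of an adaptive one, and 6-connectivity in (t, y, x). The names `grow_volume`, `video`, `seed`, and `threshold` are hypothetical.

```python
from collections import deque


def grow_volume(video, seed, threshold):
    """Grow one volume from `seed` in a video indexed as video[t][y][x].

    A neighbor joins the volume when its value lies within `threshold` of
    the volume's current mean (its centroid); the mean is updated after
    every join. Scalar values and a fixed threshold are simplifying
    assumptions made for this sketch.
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    t0, y0, x0 = seed
    members = {seed}
    total = float(video[t0][y0][x0])  # running sum of member values
    queue = deque([seed])
    # 6-connected neighborhood: two temporal and four spatial neighbors.
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
               (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        t, y, x = queue.popleft()
        for dt, dy, dx in offsets:
            n = (t + dt, y + dy, x + dx)
            if n in members:
                continue
            nt, ny, nx = n
            if not (0 <= nt < T and 0 <= ny < H and 0 <= nx < W):
                continue
            centroid = total / len(members)
            if abs(video[nt][ny][nx] - centroid) <= threshold:
                members.add(n)
                total += video[nt][ny][nx]
                queue.append(n)
    return members, total / len(members)


# Two 3x3 frames: a dark region (values near 10) next to a bright column.
video = [
    [[10, 12, 200], [11, 10, 205], [10, 11, 198]],
    [[10, 12, 201], [11, 10, 204], [12, 11, 199]],
]
members, mean = grow_volume(video, seed=(0, 0, 0), threshold=20)
print(len(members), round(mean, 2))
```

Because pixels are compared against the running mean of the whole volume rather than against a single neighbor, a gradual drift in values is less likely to leak across a boundary than with single-linkage growing.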

1. INTRODUCTION

Object segmentation is important for video compression standards as well as for recognition, event analysis, understanding, and video manipulation. By object we refer to a collection of image regions grouped under some homogeneity criteria, where a region is defined as a contiguous set of pixels. Basically, segmentation techniques can be grouped into three classes: region-based methods using a homogeneous color or texture criterion, motion-based approaches utilizing a homogeneous motion criterion, and object tracking.

Approaches in the region-oriented domain range from empirical evaluation of various color spaces [1], to clustering in feature space [2], to nearest-neighbor algorithms, to pyramid linking [3], to morphological methods [4], to split-and-merge [5], to hierarchical clustering [6]. Color-clustering-based methods often utilize histograms and are computationally simple. Histogram analysis delivers satisfactory segmentation results, especially for multimodal color distributions and where the input data set is relatively simple, clean, and fits the model well. However, this method lacks generality and robustness. Besides, histogram methods fail to establish spatial connectivity. Region-growing-based techniques provide better performance in terms of spatial connectivity and boundary accuracy than histogram-based methods. However, extracted regions may not correspond to actual physical objects unless the intensity or color of each pixel in objects differs from the background. A common problem of histogram and region-based methods arises from the fact that a video object can contain