Phase-Based Binocular Perception of Motion in Depth: Cortical-Like Operators and Analog VLSI Architectures



Silvio P. Sabatini Department of Biophysical and Electronic Engineering, University of Genoa, Via All’ Opera Pia 11a, 16145 Genova, Italy Email: [email protected]

Fabio Solari Department of Biophysical and Electronic Engineering, University of Genoa, Via All’ Opera Pia 11a, 16145 Genova, Italy Email: [email protected]

Paolo Cavalleri Department of Biophysical and Electronic Engineering, University of Genoa, Via All’ Opera Pia 11a, 16145 Genova, Italy Email: [email protected]

Giacomo Mario Bisio Department of Biophysical and Electronic Engineering, University of Genoa, Via All’ Opera Pia 11a, 16145 Genova, Italy Email: [email protected]

Received 30 April 2002 and in revised form 7 January 2003

We present a cortical-like strategy to obtain reliable estimates of the motion of objects in a scene toward/away from the observer (motion in depth), from local measurements of binocular parameters derived from a direct comparison of the results of monocular spatiotemporal filtering operations performed on stereo image pairs. This approach is suitable for hardware implementation, in which such parameters can be gained via a feedforward computation (i.e., collection, comparison, and punctual operations) on the outputs of the nodes of recurrent VLSI lattice networks performing local computations. These networks act as efficient computational structures for embedded analog filtering operations in smart vision sensors. Extensive simulations on both synthetic and real-world image sequences prove the validity of the approach, which allows one to gain high-level information about the 3D structure of the scene directly from sensorial data, without resorting to explicit scene reconstruction.

Keywords and phrases: cortical architectures, phase-based dynamic stereoscopy, motion processing, Gabor filters, lattice networks.
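The phase-based binocular measurement named in the abstract can be illustrated with a minimal sketch. This is a generic phase-difference disparity estimator, not the paper's full architecture: the local phase of each scanline is taken from the response of a complex Gabor filter, and binocular disparity is estimated as the wrapped left/right phase difference divided by the filter's peak frequency. All function names and parameter values below are illustrative assumptions.

```python
import numpy as np

def gabor_phase(signal, k=0.5, sigma=8.0):
    """Local phase of a 1D signal from its convolution with a complex Gabor filter
    tuned to spatial frequency k (radians/pixel) with Gaussian envelope sigma."""
    x = np.arange(-4 * sigma, 4 * sigma + 1)
    gabor = np.exp(-x**2 / (2 * sigma**2)) * np.exp(1j * k * x)
    response = np.convolve(signal, gabor, mode="same")
    return np.angle(response)

def phase_disparity(left_line, right_line, k=0.5, sigma=8.0):
    """Disparity estimate d ~= (phi_L - phi_R) / k, with the phase
    difference wrapped into (-pi, pi] before dividing by k."""
    dphi = gabor_phase(left_line, k, sigma) - gabor_phase(right_line, k, sigma)
    dphi = np.angle(np.exp(1j * dphi))  # wrap to (-pi, pi]
    return dphi / k

# Synthetic check: the right scanline is the left one shifted by 3 pixels,
# so the estimator should report a disparity of about 3 away from the borders.
x = np.arange(256, dtype=float)
left = np.sin(0.5 * x)
right = np.roll(left, 3)
d = phase_disparity(left, right)
```

A practical caveat of this scheme, which motivates the multi-filter machinery in phase-based stereo methods generally, is that the phase wrap limits the recoverable disparity to |d| < pi/k for a single filter channel.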

1. INTRODUCTION

In many real-world visual application domains it is important to extract dynamic 3D visual information from the 2D images impinging on the retinas. One such problem concerns the perception of motion in depth (MID), that is, the capability of discriminating between forward and backward movements of objects with respect to an observer, which has important implications for autonomous robot navigation and surveillance in dynamic environments. In general, solutions to these problems rely on a global analysis of the optic flow or on token-matching techniques that combine stereo correspondence and visual tracking. Interpreting 3D motion estimation as a reconstruction problem [1], the goal of these approaches is to obtain, from a monocular/binocular image sequence, the relative 3D motion of every scene component as well as a relative depth map of the environment. These solutions suffer from instability and require a very large computational effort, which precludes real-time reactive behaviour unless one uses data-parallel computers to deal with the large amount of symbolic information present in