Attentional Mechanisms for Interactive Image Exploration



Joseph Machrouh
Situated Perception Group, Human-Machine Communication Department, LIMSI-CNRS, BP 133, 91403 Orsay, France
Email: [email protected]
France Telecom Research & Development, 2 Pierre Marzin Avenue, BP 50702, 22307 Lannion Cedex, France
Email: [email protected]

Philippe Tarroux
Situated Perception Group, Human-Machine Communication Department, LIMSI-CNRS, BP 133, 91403 Orsay, France
École Normale Supérieure, 45 rue d’Ulm, 75230 Paris Cedex 05, France
Email: [email protected]

Received 31 December 2003; Revised 15 December 2004

Much work has been devoted to content-based image retrieval from large image databases. Traditional approaches analyze the whole image content in terms of both low-level and semantic characteristics. In this paper we investigate an approach based on attentional mechanisms and active vision. We describe a visual architecture that combines bottom-up and top-down approaches to identify regions of interest according to a given goal. We show that a coarse description of the searched target, combined with a bottom-up saliency map, provides an efficient way to find specified targets in images. The proposed system is a first step towards the development of software agents able to search for image content in image databases.

Keywords and phrases: exploratory vision, bottom-up exploration, top-down exploration, attention, situated vision.

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

Image analysis is confronted with the growth of large image databases, and new techniques have to be designed for image and content retrieval in this context. The agent paradigm has proved efficient for searching unstructured databases. An agent interacts with its environment and exhibits autonomous behavior driven by its perceptions of the environment and its expectancies. This viewpoint emphasizes the role of interaction in visual processing and is related to the active vision paradigm mainly used in robotics [1, 2]. We propose here to use a similar active vision paradigm to implement content retrieval mechanisms in still images or video sequences. To drive the active vision system, we need a mechanism for identifying salient regions in the visual scene. Most of the systems proposed for the computation of saliency maps are based on bottom-up approaches [3, 4]. We use here a bottom-up mechanism to identify a first set of salient regions and a top-down mechanism for target recognition. Salient regions can be defined as high-energy contrast regions. Regions of interest, on the other hand, are characterized by their high relevance to a given goal. Preattentional mechanisms are based on saliencies, while attentional top-down processes are goal-directed. We thus propose an