VOCUS: A Visual Attention System for Object Detection and Goal-Directed Search

Object detection and recognition is a topic of significant interest in computer and robot vision. It is required in most applications of computational vision, for example, biometric systems, medical imaging, intelligent cars, factory automation, and image



Subseries of Lecture Notes in Computer Science

3899

Simone Frintrop



Series Editors
Jaime G. Carbonell, Carnegie Mellon University, Pittsburgh, PA, USA
Jörg Siekmann, University of Saarland, Saarbrücken, Germany

Author
Simone Frintrop
Kungliga Tekniska Högskolan (KTH)
Computer Science and Communication (CSC)
Computational Vision and Active Perception Laboratory (CVAP)
10044 Stockholm, Sweden
E-mail: [email protected], [email protected]

This work was carried out at Fraunhofer Institute for Autonomous Intelligent Systems (AIS) St. Augustin, Germany and accepted as PhD thesis at the University of Bonn, Germany

Library of Congress Control Number: 2006921341

CR Subject Classification (1998): I.2.10, I.2.6, I.4, I.5, F.2.2
LNCS Sublibrary: SL 7 – Artificial Intelligence
ISSN 0302-9743
ISBN-10 3-540-32759-2 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-32759-2 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2006
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Boller Mediendesign
Printed on acid-free paper
SPIN: 11682110 06/3142 543210

Foreword

In humans, more than 30% of the brain is devoted to visual processing to allow us to interpret and behave intelligently as part of our daily lives. Vision is by far one of the most versatile and important sensory modalities for our interaction with the surrounding world. Consequently, it is not surprising that there is a considerable interest in endowing artificial systems with similar capabilities. Computational vision for embodied cognitive agents offers important competencies in terms of navigating in everyday environments, recognition of objects for interaction and interpretation of human actions as part of cooperative interaction.

One problem in terms of use of vision is computational complexity. It is well known that tasks such as search and recognition in principle might have NP complexity. At the same time, for use of vision in natural environments there is a need to operate in real-time, and thus to bound computational complexity to ensure timely response. The study of visual attention is very much the design of control mechanisms to limit complexity.

Using a rather coarse classification one might divide visual processing into data- and model/goal-driven processing. In data-drive