Functional Object Class Detection Based on Learned Affordance Cues


Computer Science Department, TU Darmstadt, Germany
{stark,lies,schiele}@informatik.tu-darmstadt.de

School of Computer Science, University of Birmingham, United Kingdom
{mxz,jlw}@cs.bham.ac.uk

Abstract. Current approaches to visual object class detection mainly focus on the recognition of basic level categories, such as cars, motorbikes, mugs and bottles. Although these approaches have demonstrated impressive recognition performance, their restriction to these categories seems inadequate in the context of embodied, cognitive agents. Here, distinguishing objects according to functional aspects based on object affordances is important in order to enable manipulation of, and interaction between, physical objects and cognitive agents. In this paper, we propose a system for the detection of functional object classes, based on a representation of visually distinct hints on object affordances (affordance cues). It spans the complete range from tutor-driven acquisition of affordance cues, through learning of corresponding object models, to detecting novel instances of functional object classes in real images.

Keywords: Functional object categories, object affordances, object category detection, object recognition.

1 Introduction and Related Work

In recent years, computer vision has made tremendous progress in the field of object category detection. Diverse approaches based on local features, such as simple bag-of-words methods [2], have shown impressive results for the detection of a variety of different objects. More recently, adding spatial information has resulted in a boost in performance [10], and combining different cues has pushed the limits even further. One of the driving forces behind object category detection is a widely adopted collection of publicly available data sets [3,7], which is considered an important instrument for measuring and comparing the detection performance of different methods. The basis for comparison is given by a set of rather abstract, basic level categories [15]. These categories are grounded in cognitive psychology, and category instances typically share characteristic visual properties. In the context of embodied cognitive agents, however, different criteria for the formation of categories seem more appropriate. Ideally, an embodied, cognitive agent (e.g., an autonomous robot) would be capable of categorizing and detecting objects according to their potential uses and their utility in performing a certain task. This functional definition of object categories is related to the notion of affordances pioneered by [6]. Fig. 1 exemplifies the differentiation between functional and basic level categories, and highlights the following two key properties: 1) functional categories may generalize across and beyond basic level categories (both a mug and a watering-can are handle-gra

Fig. 1. Basic level (left) vs. functional (right) object categories

A. Gasteratos, M. Vincze, and J.K. Tsotsos (Eds.): ICVS 2008, LNCS 5008, pp. 435–444, 2008. © Springer-Verlag Berlin Heidelberg 2008
