Object affordance detection with relationship-aware network
EXTREME LEARNING MACHINE AND DEEP LEARNING NETWORKS
Xue Zhao¹ · Yang Cao¹ · Yu Kang¹
Received: 30 November 2018 / Accepted: 28 June 2019
© Springer-Verlag London Ltd., part of Springer Nature 2019

✉ Corresponding author: Yang Cao, [email protected]
¹ Department of Automation, University of Science and Technology of China, Hefei, China
Abstract
Object affordance detection, which aims to understand the functional attributes of objects, is of great significance for an autonomous robot seeking to achieve human-like object manipulation. In this paper, we propose a novel relationship-aware convolutional neural network that takes into consideration both the symbiotic relationship between multiple affordances and the combinational relationship between affordance and objectness, in order to predict the most probable affordance label for each pixel of an object. Unlike existing CNN-based methods that rely on a separate, intermediate object detection step, our proposed network directly produces pixel-wise affordance maps from an input image in an end-to-end manner. Specifically, the network comprises three key components: a Coord-ASPP module, which introduces CoordConv into atrous spatial pyramid pooling (ASPP) to refine the feature maps; a relationship-aware module, which links affordances to their corresponding objects to explore these relationships; and an online sequential extreme learning machine (OS-ELM) auxiliary attention module, which focuses further on individual affordances to assist the relationship-aware module. Experimental results on two public datasets show the merits of each module and demonstrate the superiority of our relationship-aware network over the state of the art.

Keywords: Object affordance detection · Convolutional neural network · Relationship-aware · Online sequential extreme learning machine
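To make the Coord-ASPP idea concrete, the following is a minimal PyTorch sketch of a CoordConv-augmented ASPP block: each branch concatenates two normalized coordinate channels to its input before a (possibly dilated) convolution, following the standard ASPP layout of one 1×1 branch, several dilated 3×3 branches, and a global-pooling branch. The dilation rates, channel widths, and the exact point at which CoordConv is injected are illustrative assumptions; the paper excerpt above does not specify them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoordConv(nn.Module):
    """Conv2d preceded by two extra channels holding normalized x/y coordinates."""
    def __init__(self, in_ch, out_ch, **kwargs):
        super().__init__()
        self.conv = nn.Conv2d(in_ch + 2, out_ch, **kwargs)

    def forward(self, x):
        n, _, h, w = x.shape
        # Coordinate grids in [-1, 1], broadcast over the batch dimension.
        ys = torch.linspace(-1, 1, h, device=x.device).view(1, 1, h, 1).expand(n, 1, h, w)
        xs = torch.linspace(-1, 1, w, device=x.device).view(1, 1, 1, w).expand(n, 1, h, w)
        return self.conv(torch.cat([x, xs, ys], dim=1))

class CoordASPP(nn.Module):
    """ASPP whose branches use CoordConv; rates and channel widths are assumed."""
    def __init__(self, in_ch=2048, out_ch=256, rates=(6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            [CoordConv(in_ch, out_ch, kernel_size=1)] +
            [CoordConv(in_ch, out_ch, kernel_size=3, padding=r, dilation=r)
             for r in rates]
        )
        # Global image-level pooling branch, as in standard ASPP.
        self.pool = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Conv2d(in_ch, out_ch, 1))
        self.project = nn.Conv2d(out_ch * (len(rates) + 2), out_ch, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [branch(x) for branch in self.branches]
        feats.append(F.interpolate(self.pool(x), size=(h, w),
                                   mode='bilinear', align_corners=False))
        return self.project(torch.cat(feats, dim=1))
```

Given a backbone feature map `x` of shape `(N, 2048, H, W)`, `CoordASPP()(x)` returns a refined 256-channel map of the same spatial size, which a segmentation head can then decode into pixel-wise affordance labels.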
1 Introduction

Affordances, or functional attributes of objects, are defined by Gibson [1] as the latent "action possibilities" an environment offers an agent, given the agent's capabilities. In this sense, a hammer, for example, usually has two different affordances: one part affords pounding and the other affords grasping. When interacting with the real world, humans focus on understanding the different functions of objects in order to fulfill a certain action. Similarly, for an autonomous robot collaborating with humans, understanding object affordances is of great significance for achieving human-like object manipulation. Imagine that we ask a robot to use a hammer to pound something. Current computer vision techniques allow the robot to recognize the hammer and localize it very accurately. However, to finish the specific task, the robot further needs to know which part of the hammer can be grasped and which part can be used to pound. The problem of perceiving affordances at the pixel level has been termed "object part labelling" in the computer vision community, while it is more commonly known as "affordance detection" in robotics.