ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization

Abstract. We aim to localize objects in images using image-level supervision only. Previous approaches to this problem mainly focus on discriminative object regions and often fail to locate precise object boundaries. We address this problem by introducing two types of context-aware guidance models, additive and contrastive models, that leverage their surrounding context regions to improve localization. The additive model encourages the predicted object region to be supported by its surrounding context region. The contrastive model encourages the predicted object region to be outstanding from its surrounding context region. Our approach benefits from the recent success of convolutional neural networks for object recognition and extends Fast R-CNN to weakly supervised object localization. Extensive experimental evaluation on the PASCAL VOC 2007 and 2012 benchmarks shows that our context-aware approach significantly improves weakly supervised localization and detection.

Keywords: Object recognition · Object detection · Weakly supervised object localization · Context · Convolutional neural networks
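The difference between the two guidance models can be illustrated with a minimal sketch. It assumes each candidate region already has one class score for the region itself and one for its surrounding context; this excerpt does not specify how the scores are combined, so reading "supported by" as a sum and "outstanding from" as a difference is an assumption (the figure labels "+" and "−" suggest it). The function names and the example scores are hypothetical and are not taken from the paper.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def additive_scores(roi_scores, context_scores):
    """Assumed additive guidance: a region scores high when both the
    region and its surrounding context support the class."""
    return [r + c for r, c in zip(roi_scores, context_scores)]


def contrastive_scores(roi_scores, context_scores):
    """Assumed contrastive guidance: a region scores high when it
    stands out from its surrounding context."""
    return [r - c for r, c in zip(roi_scores, context_scores)]


def localize(roi_scores, context_scores, mode="contrastive"):
    """Return (index of best region, softmax weights over regions)."""
    combine = {"additive": additive_scores,
               "contrastive": contrastive_scores}[mode]
    weights = softmax(combine(roi_scores, context_scores))
    best = max(range(len(weights)), key=weights.__getitem__)
    return best, weights


# Hypothetical scores for three candidate regions (made-up numbers).
# Region 0: the class is evident in the region and in its context.
# Region 1: the class is evident in the region but not in its context.
# Region 2: weak evidence in both.
roi = [2.0, 2.0, 0.5]
ctx = [2.0, 0.0, 0.0]

best_add, w_add = localize(roi, ctx, mode="additive")
best_con, w_con = localize(roi, ctx, mode="contrastive")
```

With these made-up scores the additive combination favors region 0 and the contrastive combination favors region 1, which shows only that the two rules can rank the same candidates differently. It says nothing about which rule localizes better in practice.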

1 Introduction

Weakly supervised object localization and learning (WSL) [1,2] is the problem of localizing spatial extents of target objects and learning their representations from a dataset with only image-level labels. WSL is motivated by two fundamental issues of conventional object recognition. First, strong supervision in the form of object bounding boxes or segmentation masks is difficult to obtain and prevents scaling up object localization to thousands of object classes. Second, imprecise and ambiguous manual annotations can introduce subjective biases to the learning.

Convolutional neural networks (CNN) [3,4] have recently taken over the state of the art in many computer vision tasks. CNN-based methods for weakly supervised object localization have been explored in [5,6]. Despite this progress, WSL remains a very challenging problem. The state-of-the-art performance of WSL on standard benchmarks [1,2,6] is considerably lower than that of its strongly supervised counterparts [7–9]. Strongly supervised detection methods often use contextual information from regions around the object or from the whole image [7,9–13]. Indeed, visual context often provides useful information about which image regions are likely to be a target class according to object-background or object-object relations, e.g., a boat in the sea, a bird in the sky, a person on a horse, a table around a chair, etc. However, can a similar effect be

[Figure: labels recovered from the extracted text: "ROI extraction", "context-aware additive model", "person in ROI?", "person in context?", "person?" (+), "person in ROI but not in context?" (−).]

Electronic supplementary material: The online version of this chapter (doi:10.1007/978-3-319-46454-1_22) contains supplementary material, which is available to authorized users.

© Springer International Publishing AG 2016. B. Leibe et al. (Eds.): ECCV 2016, Part V, LNCS 9909, pp. 350–365, 2016. DOI: 10.1007/978-3-319-46454-1_22
