Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs


Abstract. This paper proposes a novel saliency detection method that combines region-level saliency estimation and pixel-level saliency prediction with CNNs (denoted as CRPSD). For pixel-level saliency prediction, a fully convolutional neural network (called the pixel-level CNN) is constructed by modifying the VGGNet architecture to perform multiscale feature learning, based on which an image-to-image prediction is conducted to accomplish pixel-level saliency detection. For region-level saliency estimation, an adaptive superpixel-based region generation technique is first designed to partition an image into regions, based on which the region-level saliency is estimated by using a CNN model (called the region-level CNN). The pixel-level and region-level saliencies are fused to form the final salient map by using another CNN (called the fusion CNN), and the pixel-level CNN and fusion CNN are jointly learned. Extensive quantitative and qualitative experiments on four public benchmark datasets demonstrate that the proposed method greatly outperforms state-of-the-art saliency detection approaches.

Keywords: Saliency detection · Convolutional neural network · Region-level saliency estimation · Pixel-level saliency prediction · Saliency fusion
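To make the three-branch design described above concrete, the following minimal PyTorch sketch shows how such a pipeline could be wired together: a fully convolutional, multiscale pixel-level branch built on VGG16 and a small fusion CNN that combines its output with a region-level saliency map. The stage splits, channel sizes, fusion layers, and the names PixelLevelCNN and FusionCNN are illustrative assumptions, not the authors' exact architecture, and the region-level branch is represented only by a placeholder map.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16

class PixelLevelCNN(nn.Module):
    """Fully convolutional VGG16 backbone with multiscale side outputs (assumed design)."""
    def __init__(self):
        super().__init__()
        feats = vgg16(weights=None).features
        # Split the VGG16 convolutional stack so intermediate feature maps can be tapped.
        self.stage3 = feats[:16]    # through conv3_3: 256 channels, 1/4 resolution
        self.stage4 = feats[16:23]  # through conv4_3: 512 channels, 1/8 resolution
        self.stage5 = feats[23:30]  # through conv5_3: 512 channels, 1/16 resolution
        # 1x1 convolutions map each scale's features to a single-channel saliency score.
        self.score3 = nn.Conv2d(256, 1, kernel_size=1)
        self.score4 = nn.Conv2d(512, 1, kernel_size=1)
        self.score5 = nn.Conv2d(512, 1, kernel_size=1)

    def forward(self, x):
        h, w = x.shape[2:]
        f3 = self.stage3(x)
        f4 = self.stage4(f3)
        f5 = self.stage5(f4)
        # Upsample each scale's prediction to the input size and average them
        # to obtain an image-to-image (pixel-level) saliency prediction.
        scores = [F.interpolate(head(f), size=(h, w), mode='bilinear', align_corners=False)
                  for head, f in ((self.score3, f3), (self.score4, f4), (self.score5, f5))]
        return torch.sigmoid(sum(scores) / len(scores))

class FusionCNN(nn.Module):
    """Small CNN that fuses pixel-level and region-level saliency maps (assumed design)."""
    def __init__(self):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, kernel_size=3, padding=1))

    def forward(self, pixel_sal, region_sal):
        return torch.sigmoid(self.fuse(torch.cat([pixel_sal, region_sal], dim=1)))

# Usage: in the paper the region-level saliency comes from a region-level CNN applied to
# adaptively generated superpixel regions; a random map stands in for it here.
pixel_net, fusion_net = PixelLevelCNN(), FusionCNN()
image = torch.rand(1, 3, 224, 224)
region_sal = torch.rand(1, 1, 224, 224)
final_sal = fusion_net(pixel_net(image), region_sal)   # (1, 1, 224, 224) salient map

In the paper, the pixel-level CNN and the fusion CNN are trained jointly; in a sketch of this kind that would amount to backpropagating a saliency loss on final_sal through both networks.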

1 Introduction

Visual saliency detection, an important and challenging task in computer vision, aims to highlight the most important object regions in an image. Numerous image processing applications incorporate visual saliency to improve their performance, such as image segmentation [1], cropping [2], object detection [3], and image retrieval [4]. The main task of saliency detection is to extract discriminative features that represent the properties of pixels or regions and to use machine learning algorithms to compute salient scores that measure their importance. A large number of saliency detection approaches [5–36] have recently been proposed by exploiting different salient cues. They can be roughly categorized into pixel-based approaches and region-based approaches. For the pixel-based approaches, local and global features, including edges [5], color difference [36], spatial information [6], distance transformation [30], and so on, are extracted from pixels for saliency detection.


Fig. 1. Three examples of saliency detection results estimated by the proposed method and the state-of-the-art approaches. (a) The input images. (b) The ground truths. (c) The salient maps detected by the proposed method. (d)-(g) The salient maps detected by the state-of-the-art approaches MC [26], MDF [21], LEGS [28], and MB+ [30].

Generally, these approaches either highlight high-contrast edges instead of the salient objects or produce low-contrast salient maps, because the extracted features are unable to capture the high-level and multi-scale information of pixels. As we know