Scale channel attention network for image segmentation


Jianjun Chen1,2 · Youliang Tian3 · Wei Ma1 · Zhengdong Mao1 · Yue Hu1

Received: 7 May 2019 / Revised: 18 February 2020 / Accepted: 7 April 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Object scale variation degrades image segmentation performance. Spatial pyramid pooling modules and attention mechanisms are two components widely used in deep neural networks to handle this problem, but applying either one alone commonly yields limited benefit. To push this limit, we propose a scale channel attention network (SCA-Net), which enhances the fused multi-scale features with channel attention components. After the multi-scale pooling step, the multi-scale spatial information is distributed across different feature channels. The channel attention block then guides SCA-Net to focus on the object-relevant scale channels. We further explore the channel attention block and find a simple yet effective structure that combines global average pooling and global maximum pooling, resulting in a robust global information encoder. SCA-Net does not require any time-consuming post-processing, that is, an extra step after the neural network that refines the segmentation result. On the PASCAL VOC 2012 and Cityscapes benchmarks, SCA-Net achieves test set performance of 75.5% and 77.0%, respectively.

Keywords Image segmentation · Convolutional neural network · Attention mechanism · Spatial pyramid pooling · Multi-source and heterogeneous data
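The channel attention idea described in the abstract can be illustrated with a minimal pure-Python sketch. The abstract does not say how global average pooling and global maximum pooling are combined, so the element-wise sum used here, the two-layer bottleneck with a sigmoid gate, and all names (`channel_attention`, `w1`, `w2`, the toy feature map) are assumptions made for illustration, not the authors' implementation.

```python
import math


def global_avg_pool(fmap):
    """Average each channel (an H x W list of lists) to a single value."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def global_max_pool(fmap):
    """Take the maximum of each channel."""
    return [max(max(row) for row in ch) for ch in fmap]


def dense(vec, weights):
    """Bias-free linear layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap, w1, w2):
    """Re-weight channels with gates computed from pooled global descriptors.

    ASSUMPTION: the two pooled descriptors are combined by element-wise sum
    before a shared bottleneck; the paper's actual structure may differ.
    """
    avg = global_avg_pool(fmap)
    mx = global_max_pool(fmap)
    descriptor = [a + m for a, m in zip(avg, mx)]
    hidden = [max(0.0, h) for h in dense(descriptor, w1)]  # ReLU bottleneck
    gates = [sigmoid(z) for z in dense(hidden, w2)]        # one gate per channel
    scaled = [[[g * v for v in row] for row in ch] for g, ch in zip(gates, fmap)]
    return scaled, gates


# Toy input: 4 channels of size 2 x 2. Each channel could stand for the
# output of one pooling scale (hypothetical values).
fmap = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[-1.0, 1.0], [0.5, -0.5]],
    [[2.0, 2.0], [2.0, 2.0]],
]
w1 = [[0.5, -0.2, 0.1, 0.3], [-0.4, 0.6, 0.2, -0.1]]      # 4 -> 2
w2 = [[0.3, -0.5], [0.1, 0.2], [-0.6, 0.4], [0.2, 0.7]]   # 2 -> 4

scaled, gates = channel_attention(fmap, w1, w2)
print([round(g, 3) for g in gates])
```

Each gate lies strictly between 0 and 1, so channels that the gate scores as less relevant are suppressed rather than removed. In a trained network the weights would be learned, not fixed by hand as above.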

Youliang Tian
[email protected]

1 National Engineering Laboratory for Information Security Technologies, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100093, China
2 School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
3 Guizhou Provincial Key Laboratory of Public Big Data, College of Computer Science and Technology, GuiZhou University, Guiyang, Guizhou, 550025, China

Multimedia Tools and Applications

1 Introduction

Image segmentation is an essential topic in image content understanding and has broad prospects in image editing, autonomous driving, and multi-source and heterogeneous image analytics. Recently, state-of-the-art results in image segmentation have mainly been achieved by convolutional neural networks (CNNs) [4, 21, 24, 27]. Although researchers have made progress on the task, effectively fusing global and local features and improving the scale robustness of CNNs remains hard. For multi-source images in particular, object scale variance is very complex. As shown in Fig. 1, there are large-scale stuff/things such as cars and roads and tiny-scale objects such as pedestrians. Generally, a CNN with a larger receptive field gains more global information and generates a better representation of large-scale stuff/things. For a tiny-scale object, a larger receptive field also captures more global information, however, which contain
