Two-branch encoding and iterative attention decoding network for semantic segmentation


ORIGINAL ARTICLE

Hegui Zhu¹ • Min Zhang¹ • Xiangde Zhang¹ • Libo Zhang²

Received: 9 March 2020 / Accepted: 19 August 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract

Deep convolutional neural networks (DCNNs) have shown outstanding performance in semantic image segmentation. In this paper, we propose a two-branch encoding and iterative attention decoding semantic segmentation model. In the encoding stage, an improved PeleeNet is used as the backbone branch to extract dense image features, and a spatial branch is used to preserve fine-grained information. In the decoding stage, iterative attention decoding is employed to refine the segmentation results with multi-scale features. Furthermore, we propose a channel position attention module and a boundary residual attention module to learn different position and boundary features, which enrich the position information of target boundaries. We use SegNet as the basic network and conduct experiments on the CamVid dataset to evaluate the effect of each component of the proposed model in terms of accuracy and mIOU. We then verify the segmentation performance of the proposed model with comparative experiments on the CamVid, Cityscapes and PASCAL VOC 2012 datasets. In particular, the model achieves 91.7% segmentation accuracy and 67.1% mIOU on the CamVid dataset, which verifies the effectiveness of the proposed model. In the future, target detection could be combined with semantic segmentation to further improve the segmentation of small objects. We also hope to further optimize the model structure and reduce its time complexity and number of parameters while preserving its effectiveness.

Keywords: Semantic segmentation · Two-branch encoding · Improved PeleeNet · Iterative attention decoding · Channel position attention · Boundary residual attention
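The abstract reports results as segmentation accuracy and mIOU. As a minimal sketch of how these two metrics are commonly computed from a confusion matrix, the following pure-Python example may help. It assumes the standard definitions (pixel accuracy as correct pixels over all pixels, mIOU as the mean of per-class TP / (TP + FP + FN)); the function names and the toy labels are illustrative only, and the excerpt does not state the exact evaluation protocol used in the paper (for example, how void or ignored labels are handled).

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Rows index the ground-truth class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy_and_miou(cm: List[List[int]]) -> Tuple[float, float]:
    """Return (pixel accuracy, mean IoU) under the standard definitions."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    ious = []
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                        # class i pixels predicted as another class
        fp = sum(cm[j][i] for j in range(n)) - tp   # other pixels predicted as class i
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both ground truth and prediction
            ious.append(tp / union)
    return correct / total, sum(ious) / len(ious)


# Toy example (hypothetical data): six pixels, three classes, flattened label maps.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
acc, miou = pixel_accuracy_and_miou(confusion_matrix(truth, pred, 3))
print(f"pixel accuracy = {acc:.3f}, mIOU = {miou:.3f}")
```

Because mIOU averages over classes rather than pixels, a rare class that is segmented poorly lowers mIOU far more than it lowers pixel accuracy, which is one reason the two numbers can differ widely on the same dataset.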

✉ Xiangde Zhang [email protected]
Hegui Zhu [email protected]
Min Zhang [email protected]
Libo Zhang [email protected]

1 College of Sciences, Northeastern University, Shenyang 110819, China
2 Department of Radiology, The General Hospital of Northern Theater Command PLA, Shenyang 110016, China

1 Introduction

Semantic image segmentation plays a significant role in computer vision and is often used in scene understanding [1–3], object detection [4–6] and autonomous driving [7]. Recently, deep convolutional neural networks (DCNNs) have achieved significant success and extensive application in image classification [8–11], but they have some limitations when solving dense prediction tasks. In particular, semantic image segmentation needs denser features and more spatial information; however, DCNNs such as VGG and ResNet have complex structures and lack dense features. During the encoding stage, with the consecutive pooling layers and strided convolutions, the input image will lose fine-grained image structure, global conte
