A De-raining semantic segmentation network for real-time foreground segmentation



ORIGINAL RESEARCH PAPER

Fanyi Wang1 · Yihui Zhang2

Received: 19 June 2020 / Accepted: 18 October 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract Few studies have specifically addressed real-time semantic segmentation in rainy environments, although demand in this area is high and the task is challenging for lightweight networks. This paper therefore proposes a lightweight network designed specifically for foreground segmentation in rainy environments, named De-raining Semantic Segmentation Network (DRSNet). Based on an analysis of the characteristics of raindrops, the MultiScaleSE Block is designed to encode the input image: it uses multi-scale dilated convolutions to enlarge the receptive field and an SE attention mechanism to learn a weight for each channel. To combine semantic information across encoder and decoder layers, an Asymmetric Skip is proposed: the output of a higher semantic layer of the encoder is resized by bilinear interpolation, passed through a pointwise convolution, and then added element-wise to a lower semantic layer of the decoder. In control experiments, the MultiScaleSE Block and the Asymmetric Skip both improve the Foreground Accuracy index over SEResNet18 and Symmetric Skip, respectively. DRSNet has only 0.54M parameters and requires only 0.20 GFLOPs (floating point operations). State-of-the-art results and real-time performance are achieved on both the UESTC all-day Scenery add rain (UAS-add-rain) and the Baidu People Segmentation add rain (BPS-add-rain) benchmarks with input sizes of 192×128, 384×256 and 768×512. DRSNet is faster than all compared networks within 1 GFLOPs, and its Foreground Accuracy is also the best among networks of similar magnitude on both benchmarks.

Keywords  Real-time · Rainy environments · Foreground segmentation · Encoder-decoder · Lightweight network
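As an illustration of the two mechanisms named in the abstract, the following is a minimal sketch in plain Python, not the authors' implementation. It shows SE channel weighting (global average pooling followed by a small bottleneck and a sigmoid) and an asymmetric skip (bilinear resize, pointwise convolution, element-wise addition). The function names, the weight values, the tensor shapes and the align-corners sampling convention are all assumptions made for this example; the multi-scale dilated convolutions of the MultiScaleSE Block are not covered.

```python
import math


def se_channel_weights(feature, w_reduce, w_expand):
    """Squeeze-and-excitation sketch: returns one weight in (0, 1) per channel.

    feature:  [C][H][W] nested lists
    w_reduce: [C_mid][C] bottleneck weights (hypothetical values)
    w_expand: [C][C_mid] expansion weights (hypothetical values)
    """
    # Squeeze: global average pooling over each channel.
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature]
    # Excite: bottleneck with ReLU, then expansion with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    return [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
            for row in w_expand]


def bilinear_resize(channel, out_h, out_w):
    """Bilinear interpolation of one [H][W] channel (align-corners, assumed)."""
    in_h, in_w = len(channel), len(channel[0])
    out = []
    for i in range(out_h):
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        dy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            dx = x - x0
            top = channel[y0][x0] * (1 - dx) + channel[y0][x1] * dx
            bottom = channel[y1][x0] * (1 - dx) + channel[y1][x1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out


def asymmetric_skip(encoder_feat, decoder_feat, w_pointwise):
    """Resize encoder features, mix channels with a 1x1 conv, add to decoder.

    encoder_feat: [C_enc][h][w], decoder_feat: [C_dec][H][W],
    w_pointwise:  [C_dec][C_enc] pointwise-convolution weights (hypothetical).
    """
    out_h, out_w = len(decoder_feat[0]), len(decoder_feat[0][0])
    resized = [bilinear_resize(ch, out_h, out_w) for ch in encoder_feat]
    fused = []
    for c_out, dec_ch in enumerate(decoder_feat):
        weights = w_pointwise[c_out]
        fused.append([
            [dec_ch[i][j] + sum(w * r[i][j] for w, r in zip(weights, resized))
             for j in range(out_w)]
            for i in range(out_h)
        ])
    return fused


# Toy data: a 2-channel 2x2 encoder map fused into a 1-channel 3x3 decoder map.
encoder = [[[0.0, 1.0], [2.0, 3.0]], [[1.0, 1.0], [1.0, 1.0]]]
decoder = [[[0.0] * 3 for _ in range(3)]]
fused = asymmetric_skip(encoder, decoder, [[1.0, 0.5]])
print(fused[0][1])  # [1.5, 2.0, 2.5]
print(se_channel_weights(encoder, [[1.0, -1.0]], [[0.0], [2.0]]))
```

In a real network both steps would operate on batched tensors with learned weights; the nested-list form is used here only to keep each arithmetic step visible.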

* Fanyi Wang [email protected]
Yihui Zhang [email protected]

1 State Key Laboratory of Modern Optical Instrumentation, Zhejiang University, Hangzhou 310027, China
2 School of Mechatronics Engineering, Henan University of Science and Technology, 263 Kaiyuan Avenue, Luoyang, China

1 Introduction Semantic segmentation networks are being proposed at a rapid pace and are widely used in industry and daily life. Current networks are developing mainly in two directions: “being lighter and faster while maintaining a given level of performance” and “breaking through the current performance indicators”. In recent years, with the advance of AI technology, lightweight networks [1–13] have been updated and iterated more and more rapidly. For example, ENet [1], CGNet [2], ContextNet [3], LEDNet [4], DFANet [5] and FDDWNet [6] all aim to strike a balance between accuracy and model complexity. The design of lightweight networks mainly has the following