DSFMA: deeply supervised fully convolutional neural networks based on multi-level aggregation for saliency detection



Inam Ullah, et al. [full author details at the end of the article]

Received: 29 May 2020 / Revised: 22 September 2020 / Accepted: 19 October 2020

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

In recent years, fully convolutional neural networks (FCNs) have delivered significant success in saliency detection. Although different levels of FCN layers hold different types of information for salient object detection, finding a generic method that integrates all relevant information through multi-level aggregation remains challenging. In this paper, we present a novel multi-level aggregation method that follows a U-shaped architecture built on the VGG-16 network. The shallower layers of FCNs contain low-level features that capture the fine details of salient objects, while the deeper layers hold high-level features that carry more contextual information. To exploit all the relevant information, we extend the last four side-outputs of U-Net on the encoder and decoder sides and then use skip- and short-connections to combine the high-level contextual knowledge with the low-level details. In addition, we integrate recurrent convolutional layers (RCLs) into our model, which add depth and enhance its capability to integrate contextual knowledge. Finally, we combine all the side-outputs into a final saliency map for salient object detection. We evaluate the proposed model on six widely used saliency detection benchmarks and compare it with 11 other state-of-the-art approaches. Experimental results show that our method achieves favorable performance on all evaluation measures considered.

Keywords Salient object detection · Multi-level · Saliency detection · Short-connections · Skip-connections · Fully convolutional neural network
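The final step summarized in the abstract, combining side-outputs of different resolutions into one saliency map, can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-neighbour upsampling, the uniform fusion weights, and the names `upsample_nearest` and `fuse_side_outputs` are all assumptions made for illustration, and the maps are plain Python lists rather than network tensors.

```python
import math


def upsample_nearest(side, factor):
    """Nearest-neighbour upsampling of a 2-D map (list of rows) by an integer factor."""
    out = []
    for row in side:
        # Repeat every value `factor` times along the row ...
        wide = [v for v in row for _ in range(factor)]
        # ... and repeat the widened row `factor` times along the column.
        out.extend([list(wide) for _ in range(factor)])
    return out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fuse_side_outputs(side_logits, weights=None):
    """Fuse square side-output logit maps into one saliency map in [0, 1].

    side_logits: list of square 2-D maps; the first one defines the target
                 resolution, and the others must divide it evenly.
    weights:     hypothetical per-side fusion weights (assumption: uniform
                 when omitted; a trained model would learn them).
    """
    target = len(side_logits[0])
    if weights is None:
        weights = [1.0 / len(side_logits)] * len(side_logits)

    fused = [[0.0] * target for _ in range(target)]
    for side, w in zip(side_logits, weights):
        up = upsample_nearest(side, target // len(side))
        for i in range(target):
            for j in range(target):
                fused[i][j] += w * up[i][j]

    # Squash the weighted sum of logits into a saliency probability.
    return [[sigmoid(v) for v in row] for row in fused]


# Usage: a fine 4x4 map (shallow layer) fused with a coarse 2x2 map (deep layer).
shallow = [[0.0] * 4 for _ in range(4)]
deep = [[2.0, -2.0], [-2.0, 2.0]]
saliency = fuse_side_outputs([shallow, deep])
```

In this toy setting the coarse map contributes the contextual decision (which quadrant is salient) while the fine map would contribute boundary detail; with uniform weights each map has equal influence on the fused result.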

Dr. Jian and Dr. Yin are co-corresponding authors

Multimedia Tools and Applications

1 Introduction

Saliency detection, which focuses on discovering the most salient objects or parts (i.e., the most noticeable and prominent objects) in an image, has received significant attention in recent years. Salient object detection has shown excellent performance in computer vision as a preprocessing step, e.g., semantic segmentation [10], image retrieval [12, 17], object retargeting [24, 25, 57], visual tracking [5, 42], facial-feature detection [21], underwater vision [22, 23] and scene classification [46, 50]. Despite extensive research over the last two decades, salient object detection remains an open research problem, because a wide variety of factors can play different roles in describing visual saliency, and it is very difficult to collect all hand-tuned features or cues appropriately. In fact, detecting salient objects needs a sem