Co-learning saliency detection with coupled channels and low-rank factorization

ORIGINAL PAPER

Yuteng Gao1 · Shuyuan Yang2

Received: 16 February 2019 / Revised: 8 February 2020 / Accepted: 30 March 2020

© Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract In this paper, a co-learning saliency detection method based on coupled channels and low-rank factorization is proposed. It imitates the structural sparse coding and cooperative processing mechanisms of the dorsal “where” and ventral “what” pathways in the human visual system (HVS). First, images are partitioned into superpixels, and their structural sparsity is explored to locate pure background regions along the image borders. Second, images are processed by the two cortical pathways to cooperatively learn a “where” feature map and a “what” feature map, taking the background as the dictionary and using the sparse coding errors as an indication of saliency. Finally, the two feature maps are integrated to generate the saliency map. Because the “where” and “what” feature maps are complementary, the method can highlight the salient region and suppress the background. Experiments are conducted on several public benchmarks, and the results show that the method outperforms its counterparts.

Keywords Saliency detection · Co-learning · Cortical pathways · Low-rank factorization
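The idea of taking the background as the dictionary and reading saliency from the sparse coding error can be illustrated with a small sketch. It is not the authors' implementation: the solver here is plain matching pursuit, swapped in for whatever coding scheme the paper uses, and the low-rank factorization and the two-pathway coupling are left out. The function names (`normalize`, `matching_pursuit_error`, `saliency_scores`), the toy 3-D colour features and the choice of border superpixels are all hypothetical.

```python
import math

def normalize(v):
    """Scale a feature vector to unit L2 norm (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def matching_pursuit_error(x, atoms, n_nonzero=2):
    """Greedy sparse coding of x over unit-norm atoms; returns the squared residual norm."""
    residual = list(x)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        scores = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(scores[k]))
        # Remove that atom's contribution from the residual.
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return sum(r * r for r in residual)

def saliency_scores(features, background_ids, n_nonzero=2):
    """Score each superpixel by how poorly the background dictionary reconstructs it."""
    # Assumption: the dictionary atoms are the normalized background features themselves.
    atoms = [normalize(features[i]) for i in background_ids]
    errors = [matching_pursuit_error(f, atoms, n_nonzero) for f in features]
    top = max(errors)
    # Rescale so the worst-reconstructed superpixel has score 1.
    return [e / top if top > 0 else 0.0 for e in errors]

# Toy data: hypothetical mean-colour features for five superpixels.
features = [
    [0.9, 0.1, 0.1],    # border superpixel, assumed background
    [0.8, 0.2, 0.1],    # border superpixel, assumed background
    [0.85, 0.15, 0.1],  # interior, background-like
    [0.1, 0.1, 0.9],    # interior, distinct colour
    [0.9, 0.12, 0.1],   # interior, background-like
]
scores = saliency_scores(features, background_ids=[0, 1])
```

In this toy setting the superpixel with the distinct colour is the one the background atoms reconstruct worst, so it receives the highest score, while background-like superpixels score close to zero.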

This work was supported by the National Natural Science Foundation of China (Nos. 61771380, 61906145, U1730109, 91438103, 61771376, 61703328, 91438201, U1701267, 61703328).

Shuyuan Yang [email protected]
Yuteng Gao [email protected]

1 Northwest Polytechnic University, Xi’an, China
2 Xidian University, Xi’an, China

1 Introduction

Visual saliency detection has been a hot topic in the computer vision community because of its wide range of applications, such as scenario analysis, object detection and image compression. However, saliency detection is a complicated task, and researchers have studied it from different perspectives. The available detection methods can be broadly classified into three categories: cognitive models [1, 2], heuristic feature-based models [3, 4] and learning-based models [5, 6]. Cognitive models broaden our understanding of the biological underpinnings of visual attention and help explain the computational principles of this process. Early models mostly focus on explaining the underlying mechanisms of human visual attention with computationally plausible models. As a milestone among cognitive models, the Itti model [1] uses a set of linear center-surround differences to spotlight the visual space via color, illumination and orientation features. Motivated by it, Olivier [2] proposes a bottom-up method based on the human visual system (HVS), which uses contrast, perception, visual mask and center-surround features to locate salient regions. Many cognitive models were developed in the early stage of saliency detection, inspired by the fundamental working mechanisms of visual attention; in these models, low-level features are explored to deduce the saliency map with manually designed generation rules. Later heuri