Enhance the recognition ability to occlusions and small objects with Robust Faster R-CNN


ORIGINAL ARTICLE

Tao Zhou¹ · Zhixin Li¹ · Canlong Zhang¹

Received: 29 March 2019 / Accepted: 21 August 2019
© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Abstract Recognizing objects at vastly different scales and objects that are partially occluded is a fundamental challenge in computer vision. This paper addresses the issue by proposing Robust Faster R-CNN, a novel approach for detecting objects in multi-label images. Robust Faster R-CNN employs a cascaded network structure based on the Faster R-CNN architecture to extract features from objects of different sizes. The proposed design is more robust than Faster R-CNN in two respects. First, it replaces the RoIPooling operation with RoIAligns, which eliminates the harsh quantization performed by RoIPooling, and extends it to a multi-scale RoIAligns operation that uses multiple pool sizes to adapt the detection ability of the network to objects of different sizes. Second, it combines an adversarial network with the proposed network to generate occluded training samples that significantly affect the classification ability of the model, which improves its robustness to occlusions. Experimental results on the PASCAL VOC 2012 and 2007 datasets demonstrate the superiority of the proposed object detection approach over several state-of-the-art approaches.

Keywords Object detection · Robust Faster R-CNN · Multi-cascaded network · Adversarial network · Feature fusion
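To make the quantization issue mentioned in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It pools a region of interest (RoI) by bilinear sampling at fractional coordinates, which is the general idea behind RoIAlign, and repeats the pooling at several output sizes. The function names, the pool sizes (2, 4), the 2 x 2 sampling grid per bin, and the use of `floor` as a stand-in for RoIPooling-style coordinate snapping are all assumptions made for illustration.

```python
import math


def bilinear(feat, y, x):
    """Bilinearly interpolate a 2-D feature map (list of rows) at a fractional point."""
    h, w = len(feat), len(feat[0])
    # Clamp the sample point so it stays inside the map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feat[y0][x0] * (1 - dx) + feat[y0][x1] * dx
    bottom = feat[y1][x0] * (1 - dx) + feat[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feat, roi, out_size, samples=2):
    """Average-pool an RoI to out_size x out_size without rounding its coordinates.

    roi is (y1, x1, y2, x2) in fractional feature-map coordinates.
    Each output bin averages samples x samples bilinear samples.
    """
    y1, x1, y2, x2 = roi
    bin_h = (y2 - y1) / out_size
    bin_w = (x2 - x1) / out_size
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0.0
            for sy in range(samples):
                for sx in range(samples):
                    yy = y1 + (i + (sy + 0.5) / samples) * bin_h
                    xx = x1 + (j + (sx + 0.5) / samples) * bin_w
                    total += bilinear(feat, yy, xx)
            row.append(total / (samples * samples))
        out.append(row)
    return out


def snap_roi(roi):
    """Hypothetical stand-in for RoIPooling quantization: snap coordinates to the grid."""
    return tuple(float(math.floor(v)) for v in roi)


def multi_scale_roi_align(feat, roi, pool_sizes=(2, 4)):
    """Pool the same RoI at several output sizes (the sizes are illustrative only)."""
    return {size: roi_align(feat, roi, size) for size in pool_sizes}


# A 5 x 5 feature map whose value equals the column index.
feat = [[float(x) for x in range(5)] for _ in range(5)]
roi = (0.5, 0.5, 2.5, 2.5)

aligned = roi_align(feat, roi, 2)             # samples at the true fractional location
snapped = roi_align(feat, snap_roi(roi), 2)   # samples after snapping to (0, 0, 2, 2)
pooled = multi_scale_roi_align(feat, roi)     # one pooled grid per pool size
```

On this ramp-shaped map, the first bin of `aligned` averages to 1.0, while the first bin of `snapped` averages to 0.5. That half-cell shift is negligible for a large object but can cover a substantial part of a small one, which is one way to read the motivation for avoiding quantization and for pooling at more than one size.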

* Zhixin Li
¹ Guangxi Key Lab of Multi-source Information Mining and Security, Guangxi Normal University, Guilin 541004, China

1 Introduction Object detection is one of the fundamental problems in computer vision, and it has advanced substantially over the past few years thanks to progress in deep learning. Prevalent object detectors mostly treat detection as a problem of classifying candidate boxes [4, 5, 17]. This has led to the increasingly successful application of convolutional neural networks (CNNs) in image recognition tasks [18, 25–27]. As a result, an increasing number of novel object detection methods based on CNNs [2, 10, 19] have been proposed. These structurally diverse frameworks have improved the accuracy of object detection to a certain degree, and many have achieved real-time performance on many benchmark datasets. However, images typically contain occlusions and small objects, to which most current object detection methods are not sensitive. Insensitivity to these objects will inevitably restrict the accuracy of object detection. Therefore, developing detection methods that are sensitive to occlusions and small objects in images is a key problem that must be addressed to provide more robust object detection. In general, the problem of small object detection is actually a problem of detecting objects with vastly different size scales, which is a very common problem in o