C-FCN: Corners-based fully convolutional network for visual object detection


Lin Jiao 1,2 & Rujing Wang 1 & Chengjun Xie 1

Received: 16 March 2020 / Revised: 27 June 2020 / Accepted: 29 July 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Object detection has made significant progress in recent years. Proposal-based methods have become the mainstream object detectors, achieving excellent performance in accurately recognizing and localizing objects. However, region proposal generation is still a bottleneck. In this paper, to address the limitations of the conventional region proposal network (RPN), which defines dense anchor boxes with different scales and aspect ratios, we propose an anchor-free proposal generator named corner region proposal network (CRPN), which is based on a pair of key-points: the top-left corner and the bottom-right corner of an object bounding box. First, we predict the top-left corners and bottom-right corners separately with two sibling convolutional layers; then we obtain a set of object proposals through a grouping strategy and the non-maximum suppression algorithm. Finally, we merge CRPN and a fully convolutional network (FCN) into a unified network, achieving end-to-end object detection. Our method has been evaluated on the standard PASCAL VOC and MS COCO datasets using a deep residual network. Experimental results show that the proposed method outperforms previous detectors in terms of precision. Additionally, it runs at 76 ms per image on a single GPU with ResNet-50 as the backbone, which is faster than other detectors.

Keywords Object detection · Anchor-free · Corners · Region proposals · Fully convolutional network
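The proposal step summarized in the abstract (pair predicted top-left and bottom-right corners, then prune with non-maximum suppression) can be illustrated with a minimal sketch. The pairing rule (the bottom-right corner must lie strictly below and to the right of the top-left corner), the averaged score, the IoU threshold, and the names `group_corners`, `iou`, and `nms` are all assumptions made for illustration; they are not the grouping strategy defined in the paper.

```python
from typing import List, Tuple

Corner = Tuple[float, float, float]              # (x, y, score)
Box = Tuple[float, float, float, float, float]   # (x1, y1, x2, y2, score)


def group_corners(top_lefts: List[Corner], bottom_rights: List[Corner]) -> List[Box]:
    """Pair every top-left corner with every geometrically valid bottom-right corner.

    Assumption: a pair is valid when the bottom-right corner lies strictly
    below and to the right of the top-left corner; the box score is the mean
    of the two corner scores.
    """
    boxes = []
    for x1, y1, s1 in top_lefts:
        for x2, y2, s2 in bottom_rights:
            if x2 > x1 and y2 > y1:
                boxes.append((x1, y1, x2, y2, (s1 + s2) / 2.0))
    return boxes


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], iou_threshold: float = 0.5) -> List[Box]:
    """Greedy non-maximum suppression: keep the highest-scoring box, then
    drop any remaining box whose IoU with a kept box exceeds the threshold."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) <= iou_threshold for k in kept):
            kept.append(box)
    return kept


if __name__ == "__main__":
    # Hypothetical corner predictions, not data from the paper.
    tls = [(10, 10, 0.9), (12, 11, 0.6)]
    brs = [(50, 60, 0.8), (5, 5, 0.7)]
    proposals = nms(group_corners(tls, brs), iou_threshold=0.5)
    print(proposals)
```

Exhaustive pairing is quadratic in the number of corners, so a practical detector would typically keep only the top-scoring corners before grouping; that filtering is omitted here for brevity.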

* Chengjun Xie [email protected]

1 Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China

2 University of Science and Technology of China, Hefei 230026, China

Multimedia Tools and Applications

1 Introduction

Detecting objects is one of the essential computer vision tasks, aiming to localize and identify objects in images and videos [32]. It is the basis of many other computer vision tasks, such as instance segmentation [47, 20], object tracking [28, 22], and image captioning [43, 9]. Recently, with the development of deep learning, object detection has achieved remarkable breakthroughs and attracted much research attention. It now has many practical applications, for example, face detection [46, 45], autonomous driving [26], and medical diagnosis [36, 39].

Early object detectors mostly depended on hand-crafted features. Lacking effective feature representations, researchers had to design complex approaches to improve the capability of image representation. The most important of these is the Histogram of Oriented Gradients (HOG) [8] feature descriptor, which was viewed at the time as a vital improvement on the scale-invariant feature transform [34, 33] and shape contexts [3]. It has become a cornerstone of numerous object detect
