Single shot object detection with refined feature


Xiaojuan Zhang 1 & Changying Wang 1 & Li Cheng 1 & Shuihan Jiang 1 & Junting Qi 1

Received: 21 October 2019 / Revised: 22 July 2020 / Accepted: 28 July 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Object classification and localization are the two main tasks of object detectors based on the Single Shot MultiBox Detector (SSD). In general, the more feature maps a detector uses, the better its classification performance. However, when the extra feature maps carry sparse or unnecessary information, detection performance improves only slightly or may even degrade, which in turn harms object localization. Detector performance depends not only on the number of feature maps but also on bounding box regression and Non-Maximum Suppression (NMS). In this paper, we construct an SSD-based detector called Detection with Refined Feature (DRF), which introduces a center map and a scale map and reshapes the detection loss. Our motivation is to improve the accuracy of classification and localization by searching for central points and predicting the scales of the object points. The center map is used to predict the Intersection over Union (IoU) between the predicted box and the ground truth box, while the scale map considers the relationships among the different scales. Experimental results on both the Pascal VOC and MS COCO 2014 instance datasets demonstrate the effectiveness of DRF. Using Darknet53, we achieve an 86.4% mean Average Precision (mAP) on Pascal VOC2007 and an 87.4% mAP on Pascal VOC2007 and VOC2012. On MS COCO, DRF with ResNet50 still achieves moderate improvement.

Keywords Object detection · SSD · Bounding box · Center map · Scale map
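For readers unfamiliar with the two quantities named in the abstract, the following is a minimal sketch of IoU between two boxes and of standard greedy NMS. It is not the DRF implementation: the corner box format `(x1, y1, x2, y2)`, the function names `iou` and `nms`, and the 0.5 threshold are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first.

    The threshold of 0.5 is an illustrative assumption, not a value
    taken from the paper.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

In this example the second box overlaps the first with an IoU of 81/119 (about 0.68), so it is suppressed, while the third box does not overlap and is kept.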

Project supported by the Research on Pixel Coordinate Calibration Method for Video by Multi-Mobile Terminal Collaboration (No.CXZX2016029)

* Li Cheng [email protected]

1 College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China

Multimedia Tools and Applications

1 Introduction

Object detection not only classifies objects into categories but also predicts the location of each object. Bounding boxes mark the precise location of objects from the specified categories, and each box carries a class label. Redundant bounding boxes are removed by an NMS procedure [19]. Currently, object detection frameworks fall into two categories, namely two-stage detectors and one-stage detectors. Many object detectors have been proposed to improve accuracy and speed. Two-stage detectors commonly achieve better classification performance, while one-stage detectors are significantly more time-efficient and have greater applicability to real-time object detection [39]. Two-stage detectors first generate a sparse set of proposals with a proposal generator. Region classifiers are then used to predict the category of each proposed region. One-stage detectors directly make a definite
