Position-aware lightweight object detectors with depthwise separable convolutions
ORIGINAL RESEARCH PAPER
Libo Chang1,2 · Shengbing Zhang1 · Huimin Du2 · Zhonglun You2 · Shiyu Wang1
Received: 1 February 2020 / Accepted: 1 October 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract Recently, object detection algorithms have improved significantly by increasing the size of convolutional neural network (CNN) models, but the resulting increase in computational complexity poses an obstacle to practical applications. Moreover, some lightweight methods fail to take the characteristics of object detection into account and suffer a large loss of accuracy. In this paper, we design a lightweight multi-scale feature network structure and a specific convolution module for object detection based on depthwise separable convolution, which not only reduce the computational complexity but also improve the accuracy by using the position information specific to object detection. Furthermore, to improve the detection accuracy for small objects, we construct a multi-channel position-aware map and propose a knowledge distillation training method for object detection to train the lightweight model effectively. Finally, we propose a training strategy based on a key-layer guiding structure to balance performance with training time. Experimental results on the COCO dataset, with the state-of-the-art object detection algorithm YOLOv3 as the baseline, show that our model is compressed to 1/11 of the baseline size while accuracy drops by 7.4 mmAP, and that the computational latency on the GPU and ARM platforms is reduced to 43.7% and 0.29%, respectively. Compared with the state-of-the-art lightweight object detection model, MNet V2 + SSDLite, the accuracy of our model increases by 3.5 mmAP while the inference time stays nearly the same. On the PASCAL VOC2007 dataset, the accuracy of our model increases by 5.2 mAP compared to the state-of-the-art lightweight algorithm based on knowledge distillation. Therefore, in terms of accuracy, parameter count, and real-time performance, our algorithm outperforms lightweight algorithms based on knowledge distillation or depthwise separable convolution.
Keywords Object detection · Knowledge distillation · Depthwise separable convolution · Attention model
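As a rough illustration of why depthwise separable convolution lowers computational cost, the sketch below counts multiply-accumulate operations (MACs) for a standard convolution and for its depthwise separable counterpart (a depthwise k x k convolution followed by a 1 x 1 pointwise convolution). The layer dimensions are hypothetical values chosen for illustration and are not taken from the paper; the function names are likewise assumptions.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs of a k x k standard convolution producing an h x w x c_out output."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs of a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 52 x 52 output map, 128 input channels, 256 output channels, 3 x 3 kernel.
h, w, c_in, c_out, k = 52, 52, 128, 256, 3

std = standard_conv_macs(h, w, c_in, c_out, k)
sep = depthwise_separable_macs(h, w, c_in, c_out, k)
ratio = sep / std  # equals 1 / c_out + 1 / k**2

print(f"standard: {std} MACs, separable: {sep} MACs, ratio: {ratio:.4f}")
```

For a 3 x 3 kernel the ratio is dominated by the 1/k² term, so the separable form needs roughly one ninth of the MACs of the standard layer, independent of the spatial size of the feature map.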
* Libo Chang [email protected]
Shengbing Zhang [email protected]
Huimin Du [email protected]
Zhonglun You [email protected]
Shiyu Wang [email protected]
1 School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China
2 School of Electronic Engineering, Xi’an University of Posts and Telecommunication, Xi’an 710121, China
1 Introduction
Object detection is a computer vision technology that detects and localizes objects of interest within a scene and assigns a class label to each of them [1]. It consists of three main steps: region proposal, feature representation, and region classification.
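The three steps can be sketched with a deliberately simple toy pipeline. This is not the method of this paper or of any particular detector: the sliding-window proposals, the mean-intensity feature, the fixed threshold, and all function names are assumptions made purely to show how the stages connect.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (row, col, height, width)


def propose_regions(image: List[List[float]], size: int, stride: int) -> List[Box]:
    """Step 1, region proposal: enumerate square sliding windows (toy stand-in)."""
    rows, cols = len(image), len(image[0])
    return [(r, c, size, size)
            for r in range(0, rows - size + 1, stride)
            for c in range(0, cols - size + 1, stride)]


def extract_feature(image: List[List[float]], box: Box) -> float:
    """Step 2, feature representation: mean intensity of the region (toy stand-in)."""
    r, c, h, w = box
    pixels = [image[i][j] for i in range(r, r + h) for j in range(c, c + w)]
    return sum(pixels) / len(pixels)


def classify(feature: float, threshold: float = 0.5) -> str:
    """Step 3, region classification: threshold the feature (toy stand-in)."""
    return "object" if feature > threshold else "background"


def detect(image: List[List[float]], size: int = 2, stride: int = 2):
    """Run the three steps in order and keep regions labelled as objects."""
    detections = []
    for box in propose_regions(image, size, stride):
        label = classify(extract_feature(image, box))
        if label == "object":
            detections.append((box, label))
    return detections


# A 4 x 4 toy image with a bright 2 x 2 patch in the bottom-right corner.
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
detections = detect(image)
print(detections)
```

In a CNN-based detector, each of these hand-written stages is replaced by learned components, but the flow from candidate regions to features to class labels is the same.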