Multi-scale ResNet for real-time underwater object detection

ORIGINAL PAPER

Tien-Szu Pan¹ · Huang-Chu Huang² · Jen-Chun Lee² · Chung-Hsien Chen³

Received: 16 July 2020 / Revised: 2 November 2020 / Accepted: 6 November 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract
An automatic underwater object recognition system is essential to reduce the costs of underwater inspection. In this study, we propose a novel convolutional neural network architecture that is trained on underwater video frames. The method is based on a modified residual neural network (ResNet) for underwater object detection. The modified method, Multi-scale ResNet (M-ResNet), improves efficiency by using multi-scale operations to accurately detect objects of various sizes, especially small objects. The experimental results show that the proposed method achieves a mean average precision (mAP) of 96.5% in recognition performance. As a consequence, we propose a novel system for automatic object detection as an application for marine environments.

Keywords
Marine object recognition · Deep learning · Convolutional neural network · Residual neural network

Huang-Chu Huang [email protected]

¹ Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
² Department of Telecommunication Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
³ Metal Industries Research & Development Centre (MIRDC), Kaohsiung, Taiwan

1 Introduction

Object recognition is now a popular research topic in the computer vision field. The traditional process of object recognition includes preprocessing, feature extraction, and classification. Before 2012, most work on object detection was based on traditional algorithms. With the increasing amount of data and computing resources, traditional object detection methods have been replaced by artificial intelligence (AI) methods [1, 2]. Various convolutional neural network (CNN) architectures have been applied broadly in computer vision because of their success in object recognition tasks; examples include region-based convolutional neural networks (R-CNN) [3], single shot multibox detectors (SSD) [4], you only look once (YOLO) [5], and deep residual networks (ResNet) [6]. In this paper, we focus on a fast and efficient neural network architecture from the YOLO and ResNet families. The newer YOLOv4 [7] architecture boasts residual shortcut connections and upsampling. Upsampled layers are concatenated with previous layers, which helps preserve fine-grained features that aid in detecting small objects (objects occupying less than 1% of the total image area). Moreover, ResNet [6] makes the training process faster and is more accurate than equivalent neural networks; it achieves this improvement by adding simple shortcut connections that allow a signal to bypass a layer and move to the next layer in the sequence. However, even though ResNet methods offer better detection performance than others, they still do not detect small objects, especially in v
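The three ideas in the introduction above (the 1% area definition of a small object, a shortcut connection that lets a signal bypass a layer, and upsampled features concatenated with an earlier layer) can be illustrated with a minimal sketch in plain Python. This is not the authors' M-ResNet implementation. All function names are hypothetical, feature maps are represented as nested lists indexed [channel][row][column], and nearest-neighbour 2x upsampling is an assumption, since the text does not state which upsampling method is used.

```python
from typing import Callable, List

Vector = List[float]
FeatureMap = List[List[List[float]]]  # indexed [channel][row][column]


def is_small_object(box_w: float, box_h: float,
                    img_w: float, img_h: float,
                    max_fraction: float = 0.01) -> bool:
    """True if the box covers less than 1% of the image area (definition from the text)."""
    return box_w * box_h < max_fraction * img_w * img_h


def residual(x: Vector, layer: Callable[[Vector], Vector]) -> Vector:
    """Shortcut connection: the input bypasses `layer` and is added to its output.

    Real residual blocks usually apply an activation after the sum; omitted here.
    """
    fx = layer(x)
    return [a + b for a, b in zip(x, fx)]


def upsample2x(fmap: FeatureMap) -> FeatureMap:
    """Nearest-neighbour 2x upsampling of every channel (assumed method)."""
    out: FeatureMap = []
    for channel in fmap:
        rows: List[List[float]] = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column twice
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row twice
        out.append(rows)
    return out


def concat_channels(a: FeatureMap, b: FeatureMap) -> FeatureMap:
    """Concatenate two feature maps along the channel axis."""
    if len(a[0]) != len(b[0]) or len(a[0][0]) != len(b[0][0]):
        raise ValueError("feature maps must have the same height and width")
    return a + b


# A coarse 1-channel 2x2 map is upsampled to 4x4 and joined with a finer 4x4 map.
coarse = [[[1.0, 2.0], [3.0, 4.0]]]
fine = [[[0.0] * 4 for _ in range(4)]]
merged = concat_channels(upsample2x(coarse), fine)
```

In this toy example `merged` has two channels: the upsampled coarse features and the untouched fine features, so later layers can see both. With a layer that outputs zeros, `residual` returns its input unchanged, which is the sense in which the signal bypasses the layer.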
