The detection, tracking, and temporal action localisation of swimmers for automated analysis


ORIGINAL ARTICLE

Ashley Hall¹ • Brandon Victor¹ • Zhen He¹ • Matthias Langer² • Marc Elipot³ • Aiden Nibali¹ • Stuart Morgan⁴

Received: 17 December 2019 / Accepted: 27 October 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract

It is very important for swimming coaches to analyse a swimmer's performance at the end of each race, since the analysis can then be used to change strategies for the next round. Coaches rely heavily on statistics, such as stroke length and instantaneous velocity, when analysing performance. These statistics are usually derived from time-consuming manual video annotations. To obtain the required statistics automatically from swimming videos, we need to solve four challenging computer vision tasks: swimmer head detection, tracking, stroke detection, and camera calibration. We solve these problems collectively using a two-phased deep learning approach that we call Deep Detector for Actions and Swimmer Heads (DeepDASH). DeepDASH achieves a 20.8% higher F1 score for swimmer head detection and operates 6 times faster than the popular Faster R-CNN object detector. We also propose a hierarchical tracking algorithm, based on the existing SORT algorithm, which we call HISORT. HISORT produces significantly longer tracks than SORT by preserving swimmer identities for longer periods of time. Finally, DeepDASH achieves an overall F1 score of 97.5% for stroke detection across all four swimming stroke styles.

Keywords: Object detection • Tracking • Temporal action recognition • Deep learning • Convolutional neural networks

We thank the Australian Institute of Sport, Swimming Australia and Optus for providing the research innovation grant used to carry out this research.

Corresponding author: Zhen He

¹ Department of Computer Science, La Trobe University, Bundoora, Australia
² Career Science Lab (CSL), BOSS ZhiPin, Metzingen, Germany
³ Swimming Australia, Canberra, Australia
⁴ Australian Institute of Sport, Canberra, Australia

1 Introduction

Swimming coaches depend on stroke count, instantaneous velocity, stroke rate and distance per stroke for each section of the race to analyse the performance of athletes. Currently, these statistics are manually annotated from video at great expense. Using computer vision algorithms to annotate races automatically would significantly scale up the annotation of swimmers. Using an algorithm to determine the real-world coordinates of swimmers can also improve consistency, as manually estimating velocity from a 2D image is error-prone.
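To make the statistics named in the introduction concrete, the following is a minimal sketch of how stroke count, velocity, stroke rate and distance per stroke could be derived once a swimmer's real-world position and stroke times are known. It is not the authors' implementation: the function names, the input layout (per-frame timestamps, along-lane positions in metres, and stroke timestamps), and the exact definitions (stroke rate in strokes per minute, distance per stroke as section distance divided by stroke count) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SectionStats:
    stroke_count: int
    mean_velocity_mps: float      # metres per second
    stroke_rate_spm: float        # strokes per minute
    distance_per_stroke_m: float  # metres per stroke


def instantaneous_velocity(times_s: Sequence[float],
                           positions_m: Sequence[float]) -> List[float]:
    """Finite-difference velocity (m/s) between consecutive frames."""
    return [
        (positions_m[i + 1] - positions_m[i]) / (times_s[i + 1] - times_s[i])
        for i in range(len(times_s) - 1)
    ]


def section_stats(times_s: Sequence[float],
                  positions_m: Sequence[float],
                  stroke_times_s: Sequence[float]) -> SectionStats:
    """Summarise one race section (hypothetical helper).

    times_s, positions_m: per-frame timestamps and along-lane positions,
        assumed to be already converted to real-world metres.
    stroke_times_s: timestamps of detected strokes.
    """
    if len(times_s) != len(positions_m) or len(times_s) < 2:
        raise ValueError("need at least two matching time/position samples")
    duration = times_s[-1] - times_s[0]
    if duration <= 0:
        raise ValueError("timestamps must be increasing")

    distance = abs(positions_m[-1] - positions_m[0])
    # Count only the strokes that fall inside this section's time window.
    count = sum(1 for t in stroke_times_s if times_s[0] <= t <= times_s[-1])

    return SectionStats(
        stroke_count=count,
        mean_velocity_mps=distance / duration,
        stroke_rate_spm=60.0 * count / duration,
        # Undefined when no stroke was detected in the section.
        distance_per_stroke_m=distance / count if count else float("nan"),
    )
```

Calling `section_stats` once per race section yields the per-section figures a coach would otherwise annotate by hand, and `instantaneous_velocity` gives the frame-to-frame speed. In practice the finite-difference velocity would likely need smoothing, since per-frame position estimates are noisy.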


Neural Computing and Applications

In this project we take a video of the entire pool and use computer vision algorithms to automati
