MOTChallenge: A Benchmark for Single-Camera Multiple Target Tracking
Patrick Dendorfer1 · Aljoša Ošep1 · Anton Milan2 · Konrad Schindler3 · Daniel Cremers1 · Ian Reid4 · Stefan Roth5 · Laura Leal-Taixé1

Received: 1 May 2020 / Accepted: 14 October 2020

© The Author(s) 2020
Abstract

Standardized benchmarks have been crucial in pushing the performance of computer vision algorithms, especially since the advent of deep learning. Although leaderboards should not be over-claimed, they often provide the most objective measure of performance and are therefore important guides for research. We present MOTChallenge, a benchmark for single-camera Multiple Object Tracking (MOT) launched in late 2014, to collect existing and new data and create a framework for the standardized evaluation of multiple object tracking methods. The benchmark is focused on multiple people tracking, since pedestrians are by far the most studied object in the tracking community, with applications ranging from robot navigation to self-driving cars. This paper collects the first three releases of the benchmark: (i) MOT15, along with numerous state-of-the-art results that were submitted in recent years, (ii) MOT16, which contains new challenging videos, and (iii) MOT17, which extends the MOT16 sequences with more precise labels and evaluates tracking performance on three different object detectors. The second and third releases not only offer a significant increase in the number of labeled boxes, but also provide labels for multiple object classes besides pedestrians, as well as the level of visibility for every single object of interest. We finally provide a categorization of state-of-the-art trackers and a broad error analysis. This will help newcomers understand the related work and research trends in the MOT community, and hopefully shed some light on potential future research directions.

Keywords Multi-object-tracking · Evaluation · MOTChallenge · Computer vision · MOTA
Communicated by Daniel Scharstein.

Anton Milan: Work done prior to joining Amazon.

1 Introduction
✉ Patrick Dendorfer
[email protected]

Aljoša Ošep
[email protected]

Anton Milan
[email protected]

Konrad Schindler
[email protected]

Daniel Cremers
[email protected]
Evaluating and comparing single-camera multi-target tracking methods is not trivial for numerous reasons (Milan et al. 2013). Firstly, unlike for other tasks such as image denoising, the ground truth, i.e., the perfect solution one aims to achieve, is difficult to define clearly. Partially visible, occluded, or cropped targets, reflections in mirrors or windows, and objects that very closely resemble targets all impose intrinsic ambiguities, such that even humans may not agree on one particular ideal solution. Secondly, many different evaluation metrics with free parameters and ambiguous definitions often lead to conflicting quantitative results across
Stefan Roth
[email protected]

1 Technical University Munich, Munich, Germany
2 Amazon Research, Tübingen, Germany
3 ETH Zürich, Zurich, Switzerland
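The free parameters mentioned above can be made concrete with a small sketch. The following is not the benchmark's official evaluation code; it is a minimal illustration of the widely used CLEAR-MOT accuracy score (MOTA, listed in the keywords) and of one such free parameter, the intersection-over-union (IoU) threshold that decides whether a predicted box counts as matching a ground-truth box. The function names `iou` and `mota` are illustrative choices, not part of any MOTChallenge API.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Evaluation protocols typically accept a match only if this value
    exceeds a threshold (e.g. 0.5); changing that threshold changes
    the FN/FP counts and hence the final score.
    """
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)

    def area(b):
        return (b[2] - b[0]) * (b[3] - b[1])

    union = area(box_a) + area(box_b) - inter
    return inter / union if union > 0 else 0.0


def mota(num_gt, fn, fp, idsw):
    """MOTA = 1 - (FN + FP + IDSW) / GT, the CLEAR-MOT accuracy score.

    num_gt: total ground-truth boxes; fn: missed targets;
    fp: false alarms; idsw: identity switches.
    """
    return 1.0 - (fn + fp + idsw) / num_gt
```

For example, with 100 ground-truth boxes, 10 misses, 5 false positives, and 2 identity switches, `mota(100, 10, 5, 2)` yields 0.83; raising the IoU matching threshold would reclassify some matches as misses and false positives, lowering that score.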