MOTChallenge: A Benchmark for Single-Camera Multiple Target Tracking

Published:2020-12-23 Issue:4 Volume:129 Page:845-881
ISSN:0920-5691
Container-title:International Journal of Computer Vision
language:en
Short-container-title:Int J Comput Vis

Author:

Dendorfer Patrick, Ošep Aljoša, Milan Anton, Schindler Konrad, Cremers Daniel, Reid Ian, Roth Stefan, Leal-Taixé Laura

Abstract

Standardized benchmarks have been crucial in pushing the performance of computer vision algorithms, especially since the advent of deep learning. Although leaderboards should not be over-claimed, they often provide the most objective measure of performance and are therefore important guides for research. We present MOTChallenge, a benchmark for single-camera Multiple Object Tracking (MOT) launched in late 2014, to collect existing and new data and create a framework for the standardized evaluation of multiple object tracking methods. The benchmark is focused on multiple people tracking, since pedestrians are by far the most studied object in the tracking community, with applications ranging from robot navigation to self-driving cars. This paper collects the first three releases of the benchmark: (i) MOT15, along with numerous state-of-the-art results that were submitted in the last years, (ii) MOT16, which contains new challenging videos, and (iii) MOT17, which extends MOT16 sequences with more precise labels and evaluates tracking performance on three different object detectors. The second and third releases not only offer a significant increase in the number of labeled boxes, but also provide labels for multiple object classes besides pedestrians, as well as the level of visibility for every single object of interest. We finally provide a categorization of state-of-the-art trackers and a broad error analysis. This will help newcomers understand the related work and research trends in the MOT community, and hopefully shed some light on potential future research directions.

Funder

Technische Universität München

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence,Computer Vision and Pattern Recognition,Software

Link

http://link.springer.com/content/pdf/10.1007/s11263-020-01393-0.pdf

References: 157 articles.

1. Alahi, A., Ramanathan, V., & Fei-Fei, L. (2014). Socially-aware large-scale crowd forecasting. In Conference on computer vision and pattern recognition.

2. Andriluka, M., Roth, S., & Schiele, B. (2010). Monocular 3D pose estimation and tracking by detection. In Conference on computer vision and pattern recognition.

3. Andriluka, M., Iqbal, U., Insafutdinov, E., Pishchulin, L., Milan, A., Gall, J., & Schiele, B. (2018). Posetrack: A benchmark for human pose estimation and tracking. In Conference on computer vision and pattern recognition.