Author:
Yang Fan, Odashima Shigeyuki, Yamao Sosuke, Fujimoto Hiroaki, Masui Shoichi, Jiang Shan
Abstract
Despite significant developments in 3D multi-view multi-person (3D MM) tracking, current frameworks target either footprint tracking or pose tracking, but not both. Frameworks designed for the former cannot be used for the latter, because they obtain 3D positions on the ground plane directly via a homography projection, which is inapplicable to 3D poses above the ground. In contrast, frameworks designed for pose tracking generally treat multi-view and multi-frame associations in isolation and may not be sufficiently robust for footprint tracking, which uses fewer key points than pose tracking and therefore provides weaker multi-view association cues in a single frame. This study presents a unified multi-view multi-person tracking framework to bridge the gap between footprint tracking and pose tracking. Without additional modifications, the framework can take monocular 2D bounding boxes and 2D poses as its input to produce robust 3D trajectories for multiple persons. Importantly, multi-frame and multi-view information are jointly employed to improve association and triangulation. Our framework is shown to provide state-of-the-art performance on the Campus and Shelf datasets for 3D pose tracking, with comparable results on the WILDTRACK and MMPTRACK datasets for 3D footprint tracking.
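The contrast the abstract draws between homography projection and triangulation can be illustrated with a small numerical sketch. This is not the authors' framework: it is a generic textbook illustration, assuming two synthetic pinhole cameras with known 3x4 projection matrices and a ground plane at Z = 0. All names (`project`, `ground_homography`, `project_to_ground`, `triangulate`, `K`, `P1`, `P2`) are hypothetical and chosen only for this example.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X (length 3) to pixel coordinates with a 3x4 matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def ground_homography(P):
    """Image-to-ground homography for the plane Z = 0 (assumed ground plane).

    For points with Z = 0 the third column of P drops out, so the
    ground-to-image map is the 3x3 matrix built from columns 0, 1 and 3.
    """
    return np.linalg.inv(P[:, [0, 1, 3]])

def project_to_ground(H, pixel):
    """Map a pixel to ground-plane (X, Y) using the homography H."""
    p = H @ np.array([pixel[0], pixel[1], 1.0])
    return p[:2] / p[2]

def triangulate(projections, pixels):
    """Linear (DLT) triangulation of one 3D point from two or more views."""
    rows = []
    for P, (u, v) in zip(projections, pixels):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    # The homogeneous solution is the right singular vector associated
    # with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]
    return X[:3] / X[3]

# Two synthetic cameras (assumed intrinsics and poses, for illustration only).
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.array([[0.0], [0.0], [5.0]])])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [5.0]])])

foot = np.array([0.2, -0.1, 0.0])   # a point on the ground plane
head = np.array([0.2, -0.1, 1.0])   # a point off the ground plane

# A single view suffices for the ground point.
foot_xy = project_to_ground(ground_homography(P1), project(P1, foot))

# The same homography applied to an off-plane point gives a biased position.
head_xy_wrong = project_to_ground(ground_homography(P1), project(P1, head))

# Triangulation from two views recovers the off-plane point.
head_3d = triangulate([P1, P2], [project(P1, head), project(P2, head)])
```

In this sketch the homography recovers the foot position from one camera, but the same mapping misplaces the head point because that point does not lie on the assumed plane. Recovering it requires observations from at least two views, which is why pose tracking depends on multi-view association.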
Publisher
Springer Science and Business Media LLC
Subject
Artificial Intelligence, Computer Graphics and Computer-Aided Design, Computer Vision and Pattern Recognition