SCGTracker: object feature embedding enhancement based on graph attention networks for multi-object tracking

Published:2024-05-11 Issue:4 Volume:10 Page:5513-5527
ISSN:2199-4536
Container-title:Complex & Intelligent Systems
language:en
Short-container-title:Complex Intell. Syst.

Author:

Feng Xin, Jiao Xiaoning, Wang Siping, Zhang Zhixian, Liu Yan

Abstract

Multi-object tracking (MOT) is the task of identifying objects in videos. However, objects with similar appearance or under occlusion can cause frequent ID switches, which is the main challenge in current MOT. In this paper, we propose a novel multi-object tracking method based on self-cross graph neural networks, which we term SCGTracker. Building on the JDE paradigm, the method integrates object detection and tracking through graph neural networks. Specifically, we construct graph structures that capture the correlation between objects in both the spatial and temporal dimensions. To address frequent ID switching, we use an attention mechanism to aggregate object context information within the same frame and across different frames, and update the object information via graph neural networks to derive highly distinctive appearance features. These strongly distinguishable appearance features in turn reduce the number of object ID switches. In experiments on the MOT17 test set, the proposed method achieves a Multiple Object Tracking Accuracy (MOTA) of 73% and an ID F1 score of 73.2%. It also shows a substantial reduction in ID switches compared with state-of-the-art methods.
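The aggregation step described in the abstract (attention over objects in the same frame, then over objects in another frame) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the layer, so the sketch assumes plain scaled dot-product attention with no learned projections and a simple residual sum, and the names `attend` and `self_cross_update` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(queries, contexts):
    """For each query vector, return the attention-weighted sum of the
    context vectors (scaled dot-product attention, no learned weights)."""
    dim = len(queries[0])
    messages = []
    for q in queries:
        weights = softmax([dot(q, c) / math.sqrt(dim) for c in contexts])
        message = [sum(w * c[i] for w, c in zip(weights, contexts))
                   for i in range(dim)]
        messages.append(message)
    return messages


def self_cross_update(curr, prev):
    """Assumed update rule: feature + self message + cross message.

    curr: appearance features of objects in the current frame.
    prev: appearance features of objects in an earlier frame.
    """
    self_msg = attend(curr, curr)   # context from the same frame
    cross_msg = attend(curr, prev)  # context from the other frame
    return [[f + s + c for f, s, c in zip(fv, sv, cv)]
            for fv, sv, cv in zip(curr, self_msg, cross_msg)]


# Two objects in the current frame, one object in the previous frame.
current_frame = [[1.0, 0.0], [0.0, 1.0]]
previous_frame = [[1.0, 0.0]]
updated = self_cross_update(current_frame, previous_frame)
```

Each object ends up with a feature that mixes in its neighbours' appearance, weighted by similarity. In a trained network the queries, keys and values would pass through learned projections, which this sketch omits for brevity.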

Funder

Key Project of Chongqing Technology Innovation and Application Development

Natural Science Foundation of Chongqing, China

Chongqing Postgraduate Scientific Research Innovation Project and the Action Plan for the High-quality Development of Postgraduate Education of Chongqing University of Technology

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s40747-024-01426-y.pdf

References: 43 articles.

1. Bernardin K, Stiefelhagen R (2008) Evaluating multiple object tracking performance: the CLEAR MOT metrics. EURASIP J Image Video Process 2008:1–10. https://doi.org/10.1155/2008/246309

2. Bewley A, Ge Z, Ott L, et al (2016) Simple online and real-time tracking. In: 2016 IEEE International Conference on image processing (ICIP), pp 3464–3468. https://doi.org/10.1109/ICIP.2016.7533003

3. Brasó G, Leal-Taixé L (2020) Learning a neural solver for multiple object tracking. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, 2020, pp 6247–6257. https://doi.org/10.48550/arXiv.1912.07515

4. Chu P, Wang J, You Q, et al (2023) TransMOT: Spatial-temporal graph transformer for multiple object tracking. In: Proceedings of the IEEE/CVF Winter Conference on applications of computer vision, 2023, pp 4870–4880. https://doi.org/10.48550/arXiv.2104.00194

5. Dendorfer P, Rezatofighi H, Milan A, et al (2020) MOT20: A benchmark for multi-object tracking in crowded scenes. arXiv preprint arXiv:2003.09003