Abstract
Multi-object tracking (MOT) is the task of identifying objects in videos. However, objects that are occluded or similar in appearance can cause frequent ID switching, which is the main challenge in current MOT. In this paper, we propose a novel multi-object tracking method based on a self-cross graph neural network, which we term SCGTracker. Building on the JDE paradigm, the method integrates object detection and tracking through graph neural networks. Specifically, we construct graph structures to capture the correlation between objects in both the spatial and temporal dimensions. To tackle frequent ID switching, we employ an attention mechanism to aggregate object context information within the same frame and across different frames, and we update the object information via graph neural networks to derive highly distinctive appearance features. These features in turn mitigate frequent object ID switches. In experiments on the MOT17 test set, the proposed method achieves a Multiple Object Tracking Accuracy (MOTA) of 73% and an ID F1 score of 73.2%. It also substantially reduces ID switches compared with state-of-the-art methods.
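The aggregation idea described in the abstract can be illustrated with a minimal sketch. It is not the SCGTracker implementation: the function names, the toy three-dimensional features, the residual update, and the cosine-similarity matching step are all assumptions made for illustration. Plain scaled dot-product attention stands in for the paper's graph neural network layers.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, context_nodes):
    """Scaled dot-product attention with a residual update (assumed form).

    Each query node aggregates a weighted sum of the context nodes and
    adds it to its own feature vector.
    """
    dim = len(queries[0])
    updated = []
    for q in queries:
        weights = softmax([dot(q, k) / math.sqrt(dim) for k in context_nodes])
        context = [sum(w * k[i] for w, k in zip(weights, context_nodes))
                   for i in range(dim)]
        updated.append([qi + ci for qi, ci in zip(q, context)])
    return updated

def self_cross_update(curr, prev):
    # Self step: aggregate context from objects in the same frame.
    curr = attend(curr, curr)
    # Cross step: aggregate context from objects in the previous frame.
    return attend(curr, prev)

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Hypothetical appearance features for two objects in consecutive frames.
prev_frame = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.1]]
curr_frame = [[0.9, 0.1, 0.2], [0.1, 0.9, 0.1]]

refined = self_cross_update(curr_frame, prev_frame)

# Assumed association rule: match each refined feature to the most
# similar previous-frame object by cosine similarity.
matches = [max(range(len(prev_frame)), key=lambda j: cosine(f, prev_frame[j]))
           for f in refined]
print(matches)
```

In this toy case each current-frame object is matched to the previous-frame object with the same index. A real tracker would use learned projections and a global assignment step rather than a per-object argmax.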
Funder
Key Project of Chongqing Technology Innovation and Application Development
Natural Science Foundation of Chongqing, China
Chongqing Postgraduate Scientific Research Innovation Project and the Action Plan for the High-quality Development of Postgraduate Education of Chongqing University of Technology
Publisher
Springer Science and Business Media LLC