Transformer-Based Multiple-Object Tracking via Anchor-Based-Query and Template Matching
Authors:
Wang Qinyu¹, Lu Chenxu¹, Gao Long¹, He Gang¹
Affiliation:
1. State Key Laboratory of Integrated Service Networks, School of Telecommunications Engineering, Xidian University, No. 2, South Taibai Street, Hi-Tech Development Zone, Xi’an 710071, China
Abstract
Multiple-object tracking (MOT), which aims to detect and track all moving objects in a scene, plays an important role in intelligent video-processing tasks. Joint-detection-and-tracking (JDT) methods are thriving in MOT because they accomplish detection and data association in a single stage. However, slow training convergence and insufficient data association limit their performance. In this paper, the anchor-based query (ABQ) is proposed to improve the design of JDT methods for faster training convergence. By augmenting the learnable queries of the decoder with the coordinates of the anchor boxes, the ABQ introduces explicit prior spatial knowledge into the queries, focusing the query-to-feature learning of the JDT methods on a local region; this leads to faster training and better performance. Moreover, a new template matching (TM) module is designed for JDT methods, enabling them to associate detection results with trajectories through historical features. Finally, a new transformer-based MOT method, ABQ-Track, is proposed. Extensive experiments verify the effectiveness of the two modules, and ABQ-Track surpasses the performance of the baseline JDT method, TransTrack. Specifically, ABQ-Track needs only 50 training epochs to reach convergence, while TransTrack needs 150.
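The two mechanisms named in the abstract can be illustrated in a minimal numpy sketch. This is not the paper's implementation: the function names, the sinusoidal positional encoding, and the greedy cosine-similarity matching are all illustrative assumptions about how anchor-augmented queries and feature-based association are commonly realized.

```python
import numpy as np

def sine_embedding(x, dim=64, temperature=10000.0):
    # Sinusoidal encoding of each normalized scalar in x, producing (..., dim).
    i = np.arange(dim // 2)
    freqs = temperature ** (2 * i / dim)
    angles = 2 * np.pi * x[..., None] / freqs
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

def anchor_based_queries(anchors, content_queries):
    # anchors: (N, 4) normalized (cx, cy, w, h); content_queries: (N, D), D divisible by 4.
    n, d = content_queries.shape
    pos = sine_embedding(anchors, dim=d // 4).reshape(n, d)
    # Adding the anchor-derived positional term biases each query toward
    # the spatial region of its anchor box.
    return content_queries + pos

def associate_by_features(det_feats, traj_feats, threshold=0.5):
    # Greedy cosine-similarity matching of detection features to the
    # historical features of existing trajectories.
    det = det_feats / np.linalg.norm(det_feats, axis=1, keepdims=True)
    traj = traj_feats / np.linalg.norm(traj_feats, axis=1, keepdims=True)
    sim = det @ traj.T  # (num_dets, num_trajs)
    matches = {}
    for di in np.argsort(-sim.max(axis=1)):  # highest-confidence detections first
        tj = int(np.argmax(sim[di]))
        # Simplification: a detection whose best trajectory is already taken
        # stays unmatched instead of falling back to its second choice.
        if sim[di, tj] >= threshold and tj not in matches.values():
            matches[int(di)] = tj
    return matches
```

A production tracker would typically solve the assignment jointly (e.g. Hungarian matching) rather than greedily, but the sketch shows the core idea: spatial priors enter the decoder through the queries, and identity is propagated by comparing appearance features across frames.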
Subject
Electrical and Electronic Engineering, Biochemistry, Instrumentation, Atomic and Molecular Physics and Optics, Analytical Chemistry
Cited by: 2 articles.