An Improved End-to-End Multi-Target Tracking Method Based on Transformer Self-Attention

Published:2022-12-15 Issue:24 Volume:14 Page:6354
ISSN:2072-4292
Container-title:Remote Sensing
language:en
Short-container-title:Remote Sensing

Author:

Hong Yong, Li Deren, Luo Shupei, Chen Xin, Yang Yi, Wang Mi

Abstract

Multi-target multi-camera tracking places increasing demands on re-identification accuracy and tracking reliability. This study proposed an improved end-to-end multi-target tracking algorithm that adapts to multi-view, multi-scale scenes, based on the self-attention mechanism of the transformer’s encoder–decoder structure. A multi-dimensional feature extraction backbone network was combined with a self-built raster semantic map, which was stored in the encoder for correlation, to generate target position encodings and multi-dimensional feature vectors. The decoder incorporated four methods: spatial clustering and semantic filtering of multi-view targets; dynamic matching of multi-dimensional features; space–time logic-based multi-target tracking; and space–time convergence network (STCN)-based parameter passing. By fusing these decoding methods, multi-camera targets were tracked along three dimensions: temporal logic, spatial logic, and feature matching. On the MOT17 dataset, the method outperformed the current state-of-the-art method by 2.2% on the multiple object tracking accuracy (MOTA) metric. The study also proposed, for the first time, a retrospective mechanism that adopted reverse-order processing to correct historically mislabeled targets and thereby improve the identification F1-score (IDF1). On the self-built dataset OVIT-MOT01, IDF1 improved from 0.948 to 0.967 and the multi-camera tracking accuracy (MCTA) improved from 0.878 to 0.909, a significant gain in continuous tracking accuracy and reliability.
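
For readers unfamiliar with the two single-camera metrics named in the abstract, the sketch below shows how MOTA and IDF1 are commonly computed from aggregate counts, following the standard CLEAR MOT and identity-metric definitions. It is not the authors' evaluation code: the function names and the sample counts are hypothetical and chosen only for illustration, and MCTA is not covered.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Standard CLEAR MOT accuracy: 1 - (FN + FP + IDSW) / GT.

    All counts are summed over every frame of the sequence;
    num_gt is the total number of ground-truth objects.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """Identification F1: 2*IDTP / (2*IDTP + IDFP + IDFN).

    IDTP, IDFP and IDFN are the identity-level true positives,
    false positives and false negatives after matching predicted
    trajectories to ground-truth trajectories.
    """
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    print(round(mota(10, 5, 5, 100), 3))
    print(round(idf1(90, 10, 10), 3))
```

Because MOTA subtracts all three error types from 1, it can go negative when errors exceed the number of ground-truth objects, whereas IDF1 stays in [0, 1] and rewards keeping the same identity on a target over time.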

Funder

The Key Research & Development of Hubei Province

The Natural Science Foundation of Hubei Province

Publisher

MDPI AG

Subject

General Earth and Planetary Sciences

Link

https://www.mdpi.com/2072-4292/14/24/6354/pdf

Reference: 45 articles.

1. Liu, S., Kong, W., Chen, X., Xu, M., Yasir, M., Zhao, L., and Li, J. (2022). Multi-Scale Ship Detection Algorithm Based on a Lightweight Neural Network for Spaceborne SAR Images. Remote Sens., 14.

2. Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020). European Conference on Computer Vision, Springer.

3. Meinhardt, T., Kirillov, A., Leal-Taixé, L., and Feichtenhofer, C. (2022). TrackFormer: Multi-Object Tracking with Transformers. CoRR.

4. Zeng, F., Dong, B., Zhang, Y., Wang, T., Zhang, X., and Wei, Y. (2022). European Conference on Computer Vision, Springer.

5. Multi-Target Multi-Camera Tracking by Tracklet-to-Target Assignment;He;IEEE Trans. Image Process.,2020

Cited by 3 articles.

1. AReID: Rethinking Re-Identification and Occlusions for Multi-Object Tracking;2023 International Conference on Machine Learning and Applications (ICMLA);2023-12-15

2. NLOS Error Suppression Method based on UWB Indoor Positioning;2023 IEEE International Conference on Mechatronics and Automation (ICMA);2023-08-06

3. Global-Local and Occlusion Awareness Network for Object Tracking in UAVs;IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing;2023