Authors:
Yang Fu, Xiaoyang Wang, Yunchao Wei, Thomas Huang
Abstract
In this work, we propose a novel Spatial-Temporal Attention (STA) approach to tackle the large-scale person re-identification task in videos. Different from most existing methods, which simply compute clip representations by frame-level aggregation (e.g., average pooling), the proposed STA adopts a more effective way to produce robust clip-level feature representations. Concretely, STA fully exploits the discriminative parts of the target person in both spatial and temporal dimensions, producing a 2-D attention score matrix, regularized across frames, that measures the importance of each spatial part in each frame. A robust clip-level feature representation is then generated by a weighted sum guided by this mined 2-D attention score matrix. In this way, STA handles challenging cases in video-based person re-identification such as pose variation and partial occlusion. We conduct extensive experiments on two large-scale benchmarks, i.e., MARS and DukeMTMC-VideoReID. In particular, mAP reaches 87.7% on MARS, outperforming the state of the art by a large margin of more than 11.6%.
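The abstract describes the core mechanism: score each spatial part of each frame, normalize the scores across the frames of a clip, and aggregate part features with a weighted sum. Below is a minimal PyTorch sketch of that aggregation step under assumed shapes; the function name `sta_aggregate`, the l2-norm part scoring, and the simple per-part temporal normalization standing in for the paper's inter-frame regularization are illustrative assumptions, not the authors' exact formulation.

```python
import torch

def sta_aggregate(frame_features: torch.Tensor, num_parts: int = 4) -> torch.Tensor:
    """Sketch of STA-style clip aggregation (assumed shapes, simplified scoring).

    frame_features: (T, C, H, W) backbone feature maps for the T frames of one clip.
    Returns a clip-level feature of shape (num_parts, C): a temporal weighted sum of
    per-frame part features, weighted by a T x num_parts attention score matrix.
    """
    T, C, H, W = frame_features.shape
    assert H % num_parts == 0, "feature map height must split evenly into parts"

    # Split each frame's feature map into horizontal spatial parts.
    parts = frame_features.view(T, C, num_parts, H // num_parts, W)

    # Score each part by its l2 norm (a simple stand-in for the paper's
    # attention scores derived from the feature maps).
    scores = parts.pow(2).sum(dim=(1, 3, 4)).sqrt()  # (T, num_parts)

    # The paper regularizes scores across frames; here we simply normalize
    # each part's scores over the temporal dimension so columns sum to 1.
    attn = scores / scores.sum(dim=0, keepdim=True)  # (T, num_parts)

    # Pool each part spatially, then take the attention-weighted temporal sum.
    part_feats = parts.mean(dim=(3, 4))                        # (T, C, num_parts)
    clip_feat = torch.einsum('tcn,tn->nc', part_feats, attn)   # (num_parts, C)
    return clip_feat
```

For example, `sta_aggregate(torch.randn(8, 2048, 16, 8))` aggregates an 8-frame clip of ResNet-style feature maps into a (4, 2048) clip descriptor, where occluded or low-quality parts contribute less because their attention scores are smaller.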
Publisher
Association for the Advancement of Artificial Intelligence (AAAI)
Cited by: 81 articles.