MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition

Published: 2022-06
Container-title: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Author:

Wu Chao-Yuan¹, Li Yanghao¹, Mangalam Karttikeya¹, Fan Haoqi¹, Xiong Bo¹, Malik Jitendra¹, Feichtenhofer Christoph¹

Affiliation:

1. Facebook AI Research

Publisher

IEEE

References: 85 articles.

1. Temporal segment networks: Towards good practices for deep action recognition;limin;Proc ECCV,0

2. Action recognition with trajectory-pooled deep-convolutional descriptors;limin;Proc CVPR,0

3. Evaluation of local spatio-temporal features for action recognition;heng;BMVC,2009

5. Interactive prototype learning for egocentric action recognition;xiaohan;Proc ICCV,0

Cited by 69 articles.

1. STMixer: A One-Stage Sparse Action Detector;IEEE Transactions on Pattern Analysis and Machine Intelligence;2024-10

4. From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation;ACM Transactions on Multimedia Computing, Communications, and Applications;2024-08-28

5. A Survey on Backbones for Deep Video Action Recognition;2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW);2024-07-15