1. Rolling-Unrolling LSTMs for Action Anticipation from First-Person Video
2. VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding. Hu Xu et al. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021.
3. SlowFast Networks for Video Recognition
4. Video Question Answering via Gradually Refined Attention over Appearance and Motion. Dejing Xu et al. Proceedings of the 25th ACM International Conference on Multimedia.
5. Anticipative Video Transformer