1. Roberta: A robustly optimized bert pretraining approach;liu;ArXiv Preprint,2019
2. Learning to answer visual questions from web videos;yang;IEEE Transactions on Pattern Analysis & Machine Intelligence,2022
3. Learning To Recognize Procedural Activities with Distant Supervision
4. Just Ask: Learning to Answer Questions from Millions of Narrated Videos
5. Clip4clip: An empirical study of clip for end to end video clip retrieval;luo;ArXiv Preprint,2021