Funder
National Research Foundation of Korea
References
54 articles.
1. VATT: Transformers for multimodal self-supervised learning from raw video, audio and text; Akbari; Advances in Neural Information Processing Systems, 2021
2. The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing
3. wav2vec 2.0: A framework for self-supervised learning of speech representations; Baevski; Advances in Neural Information Processing Systems, 2020
4. Condensed Movies: Story Based Retrieval with Contextual Embeddings
5. MovieCLIP: Visual Scene Recognition in Movies
Cited by
1 article.