1. Hassan Akbari , Linagzhe Yuan , Rui Qian , Wei-Hong Chuang , Shih-Fu Chang , Yin Cui , and Boqing Gong . 2021 . Vatt: Transformers for multimodal self-supervised learning from raw video, audio and text. arXiv preprint arXiv:2104.11178 (2021). Hassan Akbari, Linagzhe Yuan, Rui Qian, Wei-Hong Chuang, Shih-Fu Chang, Yin Cui, and Boqing Gong. 2021. Vatt: Transformers for multimodal self-supervised learning from raw video, audio and text. arXiv preprint arXiv:2104.11178 (2021).
2. Humam Alwassel Dhruv Mahajan Bruno Korbar Lorenzo Torresani Bernard Ghanem and Du Tran. 2020. Self-Supervised Learning by Cross-Modal Audio-Video Clustering. In NeurIPS. Humam Alwassel Dhruv Mahajan Bruno Korbar Lorenzo Torresani Bernard Ghanem and Du Tran. 2020. Self-Supervised Learning by Cross-Modal Audio-Video Clustering. In NeurIPS.
3. Lisa Anne Hendricks Oliver Wang Eli Shechtman Josef Sivic Trevor Darrell and Bryan Russell. 2017. Localizing moments in video with natural language. In ICCV. 5803--5812. Lisa Anne Hendricks Oliver Wang Eli Shechtman Josef Sivic Trevor Darrell and Bryan Russell. 2017. Localizing moments in video with natural language. In ICCV. 5803--5812.
4. Jinbin Bai Chunhui Liu Feiyue Ni Haofan Wang Mengying Hu Xiaofeng Guo and Lele Cheng. 2022. LaT: Latent Translation with Cycle-Consistency for Video-Text Retrieval. (2022). arxiv: 2207.04858 [cs.CV] Jinbin Bai Chunhui Liu Feiyue Ni Haofan Wang Mengying Hu Xiaofeng Guo and Lele Cheng. 2022. LaT: Latent Translation with Cycle-Consistency for Video-Text Retrieval. (2022). arxiv: 2207.04858 [cs.CV]
5. Yang Bai , Xiaoguang Li , Gang Wang , Chaoliang Zhang , Lifeng Shang , Jun Xu , Zhaowei Wang , Fangshan Wang , and Qun Liu . 2020. SparTerm: Learning Term-based Sparse Representation for Fast Text Retrieval. ArXiv , Vol. abs/ 2010 .00768 ( 2020 ). Yang Bai, Xiaoguang Li, Gang Wang, Chaoliang Zhang, Lifeng Shang, Jun Xu, Zhaowei Wang, Fangshan Wang, and Qun Liu. 2020. SparTerm: Learning Term-based Sparse Representation for Fast Text Retrieval. ArXiv, Vol. abs/2010.00768 (2020).