1. Triantafyllos Afouras, Joon Son Chung, Andrew Senior, Oriol Vinyals, and Andrew Zisserman. 2018. Deep audio-visual speech recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 44, 12 (2018), 8717--8727.
2. Kartik Ahuja et al. 2021. Invariance principle meets information bottleneck for out-of-distribution generalization. Advances in Neural Information Processing Systems (2021).
3. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
4. VQA: Visual Question Answering
5. Martin Arjovsky, Léon Bottou, Ishaan Gulrajani, and David Lopez-Paz. 2019. Invariant risk minimization. arXiv preprint arXiv:1907.02893 (2019).