1. Aishwarya Agrawal, Jiasen Lu, Stanislaw Antol, Margaret Mitchell, Dhruv Batra, C Lawrence Zitnick, and Devi Parikh. 2017. VQA: Visual Question Answering. In Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 21--29.
2. Peter Anderson, Xiaodong He, Chris Buehler, Damien Teney, Mark Johnson, Stephen Gould, and Lei Zhang. 2018. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering. In Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 6077--6086.
3. VQA: Visual Question Answering
4. LaTr: Layout-Aware Transformer for Scene-Text VQA
5. ICDAR 2019 Competition on Scene Text Visual Question Answering