1. VQA: Visual Question Answering
2. Regina Barzilay. 2003. Learning to paraphrase: An unsupervised approach using multiple-sequence alignment. CoRR.
3. Ali Furkan Biten, Lluís Gómez, Marçal Rusiñol, and Dimosthenis Karatzas. 2019. Good news, everyone! Context driven entity-aware captioning for news images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 12466–12475.
4. Hui Chen, Guiguang Ding, Xudong Liu, Zijia Lin, Ji Liu, and Jungong Han. 2020. IMRAM: Iterative matching with recurrent attention memory for cross-modal image-text retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 12652–12660.
5. Junyoung Chung. 2014. Empirical evaluation of gated recurrent neural networks on sequence modeling. CoRR.