1. Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering
2. Saeed Amizadeh, Hamid Palangi, Alex Polozov, Yichen Huang, and Kazuhito Koishida. 2020. Neuro-symbolic visual reasoning: Disentangling. In International Conference on Machine Learning. PMLR, 279–290.
3. Peter Anderson, Basura Fernando, Mark Johnson, and Stephen Gould. 2016. Spice: Semantic propositional image caption evaluation. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part V 14. Springer, 382–398.
4. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering