1. Voulodimos et al., Deep learning for computer vision: A brief review, Comput. Intell. Neurosci., 2018.
2. Chowdhary, Natural language processing, 2020.
3. S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C.L. Zitnick, D. Parikh, VQA: Visual question answering, in: Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 2425–2433.
4. S. Kottur, J.M. Moura, D. Parikh, D. Batra, M. Rohrbach, Visual coreference resolution in visual dialog using neural module networks, in: Proceedings of the European Conference on Computer Vision, ECCV, 2018, pp. 153–169.
5. Chen et al., Cops-Ref: A new dataset and task on compositional referring expression comprehension, 2020.