1. K. Simonyan, A. Zisserman, Very Deep Convolutional Networks for Large-Scale Image Recognition (2014). arXiv preprint arXiv:1409.1556.
2. S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C.L. Zitnick, D. Parikh, VQA: Visual question answering, in Proceedings of the IEEE International Conference on Computer Vision (2015), pp. 2425–2433
3. M. Malinowski, M. Rohrbach, M. Fritz, Ask your neurons: a neural-based approach to answering questions about images, in Proceedings of the IEEE International Conference on Computer Vision (2015), pp. 1–9
4. M. Ren, R. Kiros, R. Zemel, Exploring models and data for image question answering. Adv. Neural. Inf. Process. Syst. 28, 2953–2961 (2015)
5. R. Kiros, Y. Zhu, R.R. Salakhutdinov, R. Zemel, R. Urtasun, A. Torralba, S. Fidler, Skip-thought vectors, in Advances in Neural Information Processing Systems (2015), pp. 3294–3302