1. Anderson, P., Fernando, B., Johnson, M., & Gould, S. (2016). SPICE: Semantic propositional image caption evaluation. In Proceedings of the European conference on computer vision.
2. Anderson, P., He, X., Buehler, C., Teney, D., Johnson, M., Gould, S., & Zhang, L. (2018). Bottom-up and top-down attention for image captioning and visual question answering. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition.
3. Arjovsky, M., Shah, A., & Bengio, Y. (2016). Unitary evolution recurrent neural networks. In Proceedings of the international conference on machine learning.
4. Arpit, D., Kanuparthi, B., Kerg, G., Ke, N. R., Mitliagkas, I., & Bengio, Y. (2019). h-detach: Modifying the LSTM gradient towards better optimization. In Proceedings of the international conference on learning representations.
5. Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In Proceedings of the international conference on learning representations.