1. Alayrac, J.-B., Bojanowski, P., Agrawal, N., Sivic, J., Laptev, I., Lacoste-Julien, S., 2016. Unsupervised Learning from Narrated Instruction Videos. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition. CVPR, pp. 4575–4583.
2. Amac, M.S., Yagcioglu, S., Erdem, A., Erdem, E., 2019. Procedural Reasoning Networks for Understanding Multimodal Procedures. In: Proceedings of the 23rd Conference on Computational Natural Language Learning. CoNLL, pp. 441–451.
3. Banerjee, S., Lavie, A., 2005. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments. In: Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization. pp. 65–72.
4. Bosselut, A., Celikyilmaz, A., He, X., Gao, J., Huang, P.-S., Choi, Y., 2018a. Discourse-Aware Neural Rewards for Coherent Text Generation. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). pp. 173–184.
5. Bosselut, A., Ennis, C., Levy, O., Holtzman, A., Fox, D., Choi, Y., 2018b. Simulating Action Dynamics with Neural Process Networks. In: International Conference on Learning Representations.