Video Frame-wise Explanation Driven Contrastive Learning for Procedural Text Generation

Published: 2024-04 Volume: 241 Page: 103954
ISSN: 1077-3142
Container-title: Computer Vision and Image Understanding
Language: en

Author:

Wang Zhihao, Li Lin, Xie Zhongwei, Liu Chuanbo

Funder

National Natural Science Foundation of China

Key Research and Development Program of Hunan Province of China

Publisher

Elsevier BV

References: 56 articles.

1. Alayrac, J.-B., Bojanowski, P., Agrawal, N., Sivic, J., Laptev, I., Lacoste-Julien, S., 2016. Unsupervised Learning from Narrated Instruction Videos. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition. CVPR, pp. 4575–4583.

2. Amac, M.S., Yagcioglu, S., Erdem, A., Erdem, E., 2019. Procedural Reasoning Networks for Understanding Multimodal Procedures. In: Proceedings of the 23rd Conference on Computational Natural Language Learning. CoNLL, pp. 441–451.

3. Banerjee, S., Lavie, A., 2005. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments. In: Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization. pp. 65–72.

4. Bosselut, A., Celikyilmaz, A., He, X., Gao, J., Huang, P.-S., Choi, Y., 2018a. Discourse-Aware Neural Rewards for Coherent Text Generation. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). pp. 173–184.

5. Bosselut, A., Ennis, C., Levy, O., Holtzman, A., Fox, D., Choi, Y., 2018b. Simulating Action Dynamics with Neural Process Networks. In: International Conference on Learning Representations.