Syntax-Controllable Video Captioning with Tree-Structural Syntax Augmentation

Published: 2024-04-26 Volume: 14 Pages: 1-7
Container-title: Proceedings of the 2024 2nd Asia Conference on Computer Vision, Image Processing and Pattern Recognition

Author:

Sun Jiahui¹, Song Peipei², Zhang Jing¹, Guo Dan¹

Affiliation:

1. Hefei University of Technology, School of Computer Science and Information Engineering, China

2. University of Science and Technology of China, Department of Electronic Engineering and Information Science, China

Funder

National Natural Science Foundation of China

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3663976.3664004

References (40 articles)

1. Peter Anderson, Basura Fernando, Mark Johnson, and Stephen Gould. 2017. Guided Open Vocabulary Image Captioning with Constrained Beam Search. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Copenhagen, Denmark, 936–945.

2. Satanjeev Banerjee and Alon Lavie. 2005. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization. Association for Computational Linguistics, Ann Arbor, Michigan, 65–72.

3. Fuhai Chen, Rongrong Ji, Jiayi Ji, Xiaoshuai Sun, Baochang Zhang, Ge Xuri, Yongjian Wu, Feiyue Huang, and Yan Wang. 2019. Variational Structured Semantic Inference for Diverse Image Captioning. Neural Information Processing Systems 25 (2019).

4. Qi Chen, Chaorui Deng, and Qi Wu. 2022. Learning Distinct and Representative Modes for Image Captioning. arXiv:2209.08231 (2022).

5. Yangyu Chen, Shuhui Wang, Weigang Zhang, and Qingming Huang. 2018. Less Is More: Picking Informative Frames for Video Captioning. In Computer Vision – ECCV 2018. Springer International Publishing, Cham, 367–384.