Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets

Published: 2023-12-05
ISSN: 0920-5691
Container-title: International Journal of Computer Vision
Language: en
Short-container-title: Int J Comput Vis

Author:

Cornia Marcella, Baraldi Lorenzo, Fiameni Giuseppe, Cucchiara Rita

Funder

Ministero dell’Istruzione, dell’Università e della Ricerca

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence, Computer Vision and Pattern Recognition, Software

Link

https://link.springer.com/content/pdf/10.1007/s11263-023-01949-w.pdf

References: 82 articles.

1. Agrawal, H., Desai, K., Wang, Y., Chen, X., Jain, R., Johnson, M., Batra, D., Parikh, D., Lee, S., & Anderson, P. (2019). Nocaps: Novel object captioning at scale. In Proceedings of the IEEE/CVF international conference on computer vision.

2. Alayrac, J. B., Donahue, J., Luc, P., Miech, A., Barr, I., Hasson, Y., Lenc, K., Mensch, A., Millican, K., Reynolds, M., & Ring, R. (2022). Flamingo: A visual language model for few-shot learning. Advances in Neural Information Processing Systems, 35, 23716–23736.

3. Anderson, P., Fernando, B., Johnson, M., & Gould, S. (2016). SPICE: Semantic propositional image caption evaluation. In Proceedings of the European conference on computer vision.

4. Anderson, P., He, X., Buehler, C., Teney, D., Johnson, M., Gould, S., & Zhang, L. (2018). Bottom-up and top-down attention for image captioning and visual question answering. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition.

5. Banerjee, S., & Lavie, A. (2005). METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the annual meeting of the association for computational linguistics workshops.