1. SPICE: Semantic propositional image caption evaluation;Anderson,2016
2. Bottom-up and top-down attention for image captioning and visual question answering;Anderson,2018
3. CaMEL: Mean Teacher Learning for Image Captioning;Barraco,2022
4. Language models are few-shot learners;Brown;Advances in Neural Information Processing Systems,2020
5. Improving image captioning with Pyramid Attention and SC-GAN;Chen;Image and Vision Computing,2022