Show, tell and rectify: Boost image caption generation via an output rectifier
Published: 2024-06
Volume: 585
Page: 127651
ISSN: 0925-2312
Container title: Neurocomputing
Language: en
Authors: Ge Guowei, Han Yufeng, Hao Lingguang, Hao Kuangrong (ORCID), Wei Bing (ORCID), Tang Xue-song (ORCID)