BENet: bi-directional enhanced network for image captioning

Published: 2024-01-29 Issue: 1 Volume: 30
ISSN: 0942-4962
Container-title: Multimedia Systems
Language: en
Short-container-title: Multimedia Systems

Author:

Yan Peixin, Li Zuoyong, Hu Rong, Cao Xinrong

Funder

National Natural Science Foundation of China

Natural Science Foundation of Fujian Province, China

Project of the 14th Five Year Plan of Education Studies, Fujian Province

Key Project of Educational Reform in Minjiang University

Project of The Development of Core Values throughout the Curriculum in Minjiang University

Humanities and Social Science Fund of the Ministry of Education

Publisher

Springer Science and Business Media LLC

Subject

Computer Networks and Communications, Hardware and Architecture, Media Technology, Information Systems, Software

Link

https://link.springer.com/content/pdf/10.1007/s00530-023-01230-7.pdf

References: 51 articles.

1. Li, J., Wang, Y., Zhao, D.: Layer-wise enhanced transformer with multi-modal fusion for image caption. Multimedia Syst. 29(3), 1043–1056 (2023)

2. Carmo Nogueira, T., Vinhal, C.D.N., Cruz Júnior, G., Ullmann, M.R.D., Marques, T.C.: A reference-based model using deep learning for image captioning. Multimedia Syst. 29(3), 1665–1681 (2023)

3. Wei, J., Li, Z., Zhu, J., Ma, H.: Enhance understanding and reasoning ability for image captioning. Appl. Intell. 53(3), 2706–2722 (2023)

4. Lian, Z., Zhang, Y., Li, H., Wang, R., Hu, X.: Cross modification attention-based deliberation model for image captioning. Appl. Intell. 53(5), 5910–5933 (2023)

5. Zhang, X., Sun, X., Luo, Y., Ji, J., Zhou, Y., Wu, Y., Huang, F., Ji, R.: RSTNet: captioning with adaptive attention on visual and non-visual words. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 15465–15474. IEEE (2021)