Author:
Yan Peixin, Li Zuoyong, Hu Rong, Cao Xinrong
Funder
National Natural Science Foundation of China
Natural Science Foundation of Fujian Province, China
Project of the 14th Five Year Plan of Education Studies, Fujian Province
Key Project of Educational Reform in Minjiang University
Project of The Development of Core Values throughout the Curriculum in Minjiang University
Humanities and Social Science Fund of the Ministry of Education
Publisher
Springer Science and Business Media LLC
Subject
Computer Networks and Communications, Hardware and Architecture, Media Technology, Information Systems, Software
Reference 51 articles.
1. Li, J., Wang, Y., Zhao, D.: Layer-wise enhanced transformer with multi-modal fusion for image caption. Multimedia Syst. 29(3), 1043–1056 (2023)
2. Carmo Nogueira, T., Vinhal, C.D.N., Cruz Júnior, G., Ullmann, M.R.D., Marques, T.C.: A reference-based model using deep learning for image captioning. Multimedia Syst. 29(3), 1665–1681 (2023)
3. Wei, J., Li, Z., Zhu, J., Ma, H.: Enhance understanding and reasoning ability for image captioning. Appl. Intell. 53(3), 2706–2722 (2023)
4. Lian, Z., Zhang, Y., Li, H., Wang, R., Hu, X.: Cross modification attention-based deliberation model for image captioning. Appl. Intell. 53(5), 5910–5933 (2023)
5. Zhang, X., Sun, X., Luo, Y., Ji, J., Zhou, Y., Wu, Y., Huang, F., Ji, R.: RSTNet: captioning with adaptive attention on visual and non-visual words. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 15465–15474. IEEE (2021)