PSNet: position-shift alignment network for image caption

Published:2023-11-27 Issue:2 Volume:12
ISSN:2192-6611
Container-title:International Journal of Multimedia Information Retrieval
language:en
Short-container-title:Int J Multimed Info Retr

Author:

Xue Lixia, Zhang Awen, Wang Ronggui, Yang Juan

Funder

National Natural Science Foundation of China

National Key R&D Program of China

Publisher

Springer Science and Business Media LLC

Subject

Library and Information Sciences, Media Technology, Information Systems

Link

https://link.springer.com/content/pdf/10.1007/s13735-023-00307-3.pdf

References: 58 articles.

1. Mitchell M, Dodge J, Goyal A, Yamaguchi K, Stratos K, Han X, Mensch AC, Berg AC, Berg TL, Daumé H (2012) Midge: generating image descriptions from computer vision detections. In: Conference of the European chapter of the association for computational linguistics

2. Szegedy C, Liu W, Jia Y, Sermanet P, Reed SE, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2014) Going deeper with convolutions. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR), pp 1–9

3. Farhadi A, Hejrati M, Sadeghi MA, Young P, Rashtchian C, Hockenmaier J, Forsyth DA (2010) Every picture tells a story: generating sentences from images. In: European conference on computer vision

4. Gupta A, Verma Y, Jawahar CV (2012) Choosing linguistics over vision to describe images. In: Proceedings of the AAAI conference on artificial intelligence

5. Vinyals O, Toshev A, Bengio S, Erhan D (2014) Show and tell: a neural image caption generator. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR), pp 3156–3164

Cited by 1 article.

1. MGTANet: Multi-Scale Guided Token Attention Network for Image Captioning; Proceedings of the 2024 3rd International Conference on Cyber Security, Artificial Intelligence and Digital Economy; 2024-03