Funder
the Science and Technology Project in Xi’an
Publisher
Springer Science and Business Media LLC