Author:
Peng Shixin, Xiong Can, Liu Leyuan, Yang Laurence T., Chen Jingying
Funder
Hubei Provincial Natural Science Foundation of China
Research Funds of CCNU from the Colleges' Basic Research and Operation of MOE
National Key Research and Development Program of China
National Natural Science Foundation of China
Publisher
Springer Science and Business Media LLC
References (50 articles)
1. Vinyals O, Toshev A, Bengio S, Erhan D (2015) Show and tell: A neural image caption generator. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/cvpr.2015.7298935
2. Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhudinov R, Zemel R, Bengio Y (2015) Show, attend and tell: Neural image caption generation with visual attention. In: International Conference on Machine Learning
3. Ren S, He K, Girshick R, Sun J (2017) Faster r-cnn: Towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149. https://doi.org/10.1109/TPAMI.2016.2577031
4. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Neural Information Processing Systems
5. Cornia M, Stefanini M, Baraldi L, Cucchiara R (2020) Meshed-memory transformer for image captioning. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/cvpr42600.2020.01059