Funder
National Natural Science Foundation of China
Liaoning Revitalization Talents Program
The Scientific Research Project of Liaoning Province
Key R&D projects of Liaoning Provincial Department of Science and Technology
Liaoning Provincial Key Laboratory Special Fund
University of Economics Ho Chi Minh City
Publisher
Springer Science and Business Media LLC
References (46 articles)
1. Zhou L, Palangi H, Zhang L, Corso J, Gao J (2020) Unified vision-language pre-training for image captioning and VQA. Proc AAAI Conf Artif Intell 34(07):13041–13049
2. Hu X, Gan Z, Wang J, Yang Z, Liu Z, Lu Y, Wang L (2022) Scaling up vision-language pre-training for image captioning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. IEEE, pp 17980–17989
3. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, vol 30
4. Devlin J, Chang MW, Lee K, Toutanova K (2018) BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805
5. Luo Y, Ji J, Sun X, Cao L, Wu Y, Huang F, Lin C-W, Ji R (2021) Dual-level collaborative transformer for image captioning. Proc AAAI Conf Artif Intell 35(3):2286–2293