Authors:
van Sonsbeek, Tom; Derakhshani, Mohammad Mahdi; Najdenkoska, Ivona; Snoek, Cees G. M.; Worring, Marcel
Publisher:
Springer Nature Switzerland
References: 35 articles.
1. Barraco, M., Cornia, M., Cascianelli, S., Baraldi, L., Cucchiara, R.: The unreasonable effectiveness of CLIP features for image captioning: an experimental analysis. In: CVPR, pp. 4662–4670 (2022)
2. Brown, T., et al.: Language models are few-shot learners. NeurIPS 33, 1877–1901 (2020)
3. Cong, F., Xu, S., Guo, L., Tian, Y.: Caption-aware medical VQA via semantic focusing and progressive cross-modality comprehension. In: ACM Multimedia, pp. 3569–3577 (2022)
4. Derakhshani, M.M., et al.: Variational prompt tuning improves generalization of vision-language models. arXiv:2210.02390 (2022)
5. Do, T., Nguyen, B.X., Tjiputra, E., Tran, M., Tran, Q.D., Nguyen, A.: Multiple meta-model quantifying for medical visual question answering. In: de Bruijne, M., et al. (eds.) MICCAI 2021. LNCS, vol. 12905, pp. 64–74. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87240-3_7
Cited by
7 articles.