Author:
Zheng Hongyan, Sun Tao, Zhang Zhiping, Liu Hao, Liu Gengchen, Yang Zhi, Wang Xiaoyu
Publisher
Springer Nature Singapore
References: 22 articles.
1. Marino, K., Rastegari, M., Farhadi, A., Mottaghi, R.: OK-VQA: a visual question answering benchmark requiring external knowledge. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3195–3204 (2019)
2. Antol, S., et al.: VQA: visual question answering. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2425–2433 (2015)
3. Speer, R., Chin, J., Havasi, C.: ConceptNet 5.5: an open multilingual graph of general knowledge. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31 (2017)
4. Ding, Y., Yu, J., Liu, B., Hu, Y., Cui, M., Wu, Q.: MuKEA: multimodal knowledge extraction and accumulation for knowledge-based visual question answering. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5089–5098 (2022)
5. Lin, Y., Xie, Y., Chen, D., Xu, Y., Zhu, C., Yuan, L.: REVIVE: regional visual representation matters in knowledge-based visual question answering. Adv. Neural. Inf. Process. Syst. 35, 10560–10571 (2022)