PDF-VQA: A New Dataset for Real-World VQA on PDF Documents

Published: 2023
Pages: 585-601
ISSN: 0302-9743
Container-title: Lecture Notes in Computer Science

Author:

Ding Yihao, Luo Siwen, Chung Hyunsuk, Han Soyeon Caren

Publisher

Springer Nature Switzerland

Link

https://link.springer.com/content/pdf/10.1007/978-3-031-43427-3_35

References: 35 articles.

1. Antol, S., et al.: VQA: visual question answering. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2425–2433 (2015)

2. Biten, A.F., et al.: Scene text visual question answering. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4291–4301 (2019)

3. Chaudhry, R., Shekhar, S., Gupta, U., Maneriker, P., Bansal, P., Joshi, A.: LEAF-QA: locate, encode & attend for figure question answering. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 3512–3521 (2020)

4. Lecture Notes in Computer Science;B Davis,2021

5. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1 (Long and Short Papers), pp. 4171–4186 (2019)

Cited by 2 articles.

1. On Leveraging Multi-Page Element Relations in Visually-Rich Documents;2024 IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC);2024-07-02

2. Survey of Multimodal Medical Question Answering;BioMedInformatics;2023-12-31