Authors:
Turski Michał, Stanisławek Tomasz, Kaczmarek Karol, Dyda Paweł, Graliński Filip
Publisher
Springer Nature Switzerland
References (33 articles)
1. Abadji, J., Suarez, P.O., Romary, L., Sagot, B.: Towards a cleaner document-oriented multilingual crawled corpus. arXiv preprint arXiv:2201.06642 (2022)
2. Allison, T., et al.: Research report: Building a wide reach corpus for secure parser development. In: 2020 IEEE Security and Privacy Workshops (SPW), pp. 318–326 (2020). https://doi.org/10.1109/SPW50608.2020.00066
3. Ammar, W., et al.: Construction of the literature graph in Semantic Scholar. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers), pp. 84–91. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-3011. https://aclanthology.org/N18-3011
4. Biten, A.F., Tito, R., Gomez, L., Valveny, E., Karatzas, D.: OCR-IDR: OCR annotations for industry document library dataset. arXiv preprint arXiv:2202.12985 (2022)
5. Borchmann, Ł., et al.: DUE: end-to-end document understanding benchmark. In: NeurIPS Datasets and Benchmarks (2021)
Cited by
1 article.
1. Beyond Document Page Classification: Design, Datasets, and Challenges;2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV);2024-01-03