UTTSR: A Novel Non-Structured Text Table Recognition Model Powered by Deep Learning Technology

Published: 2023-06-27 Volume: 13 Issue: 13 Page: 7556
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences

Author:

Li Min¹,², Zhang Liping¹,², Zhou Mingle¹,², Han Delong¹,²

Affiliation:

1. Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China

2. Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250014, China

Abstract

To prevent documents from being modified, many tables are distributed in non-editable, unstructured formats such as PDFs or images. Quickly recognizing the contents of such tables remains a challenge because of irregular formats, uneven text quality, and complex, diverse table content. This article proposes the UTTSR table recognition model, which consists of four parts: text region detection, text line detection, text line recognition, and table sequence recognition. For table detection, a Cascade Faster R-CNN with the ResNeXt105 network is used, with TPS (Thin Plate Spline) and affine transformations applied to correct the image and improve accuracy. For text line detection, DBNet is used with DO-Conv in the FPN (Feature Pyramid Network) to speed up training. Text lines are recognized using a CRNN without the CTC module, which enhances recognition performance. Table sequence recognition is based on a Transformer combined with post-processing algorithms that fuse the table structure sequence with the cell content. Experimental results show that UTTSR outperforms the compared methods and improves on the previous state-of-the-art F1 score on complex tables, reaching 97.8%.
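
The post-processing step mentioned in the abstract, which fuses a predicted table structure sequence with recognized cell content, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the HTML-like token vocabulary, the function name `fuse_structure_and_cells`, and the simplification that cell texts arrive in reading order (a real system would more likely match text boxes to cells by position).

```python
from html import escape
from typing import List


def fuse_structure_and_cells(structure_tokens: List[str], cell_texts: List[str]) -> str:
    """Insert recognized cell text into a predicted structure token sequence.

    Hypothetical helper: assumes every "<td>" token opens one cell and that
    cell_texts lists the recognized cell contents in the same order.
    """
    out = []
    cells = iter(cell_texts)
    for token in structure_tokens:
        out.append(token)
        if token == "<td>":
            # Fall back to an empty cell if fewer texts than cells were recognized.
            out.append(escape(next(cells, "")))
    return "".join(out)


# Illustrative input: one row with two cells (made-up data).
tokens = ["<table>", "<tr>", "<td>", "</td>", "<td>", "</td>", "</tr>", "</table>"]
texts = ["Item", "Price & tax"]
html_table = fuse_structure_and_cells(tokens, texts)
print(html_table)
```

Escaping the cell text keeps characters such as `&` from corrupting the generated markup, and the empty-string fallback keeps the structure intact when the recognizer returns fewer texts than the structure has cells.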

Funder

Shandong Provincial Natural Science Foundation

the Pilot Project for Integrated Innovation of Science, Education, and Industry of Qilu University of Technology

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/13/13/7556/pdf

Cited by 1 article.

1. Text recognition using improved dual attention based on textual double embedding network with aquila optimization algorithm;International Journal of Information Technology;2024-06-14