Few-Shot Pixel-Precise Document Layout Segmentation via Dynamic Instance Generation and Local Thresholding

Published: 2023-08-10 Issue: 10 Volume: 33
ISSN: 0129-0657
Container-title: International Journal of Neural Systems
Language: en
Short-container-title: Int. J. Neur. Syst.

Author:

De Nardin Axel¹, Zottin Silvia¹, Piciarelli Claudio¹, Colombi Emanuela², Foresti Gian Luca¹

Affiliation:

1. Department of Mathematics, Computer Science and Physics, Università degli Studi di Udine, Via delle Scienze 206, 33100 Udine, Italy

2. Department of Humanities and Cultural Heritage, Università degli Studi di Udine, Vicolo Florio 2/b, 33100 Udine, Italy

Abstract

Over the years, the humanities community has increasingly called for artificial intelligence frameworks to support the study of cultural heritage. Document layout segmentation, which aims to identify the different structural components of a document page, is a particularly interesting task connected to this trend, especially for handwritten texts. While there are many effective approaches to this problem, they all rely on large amounts of training data, which are rarely available in real-world scenarios: producing ground-truth segmentation with pixel-level precision is very time-consuming and often requires a degree of domain knowledge about the documents at hand. For this reason, in this paper we propose an effective few-shot learning framework for document layout segmentation that relies on two novel components, namely a dynamic instance generation module and a segmentation refinement module. This approach achieves performance comparable to the current state of the art on the popular Diva-HisDB dataset while relying on just a fraction of the available data.

Funder

Piano Nazionale di Ripresa e Resilienza

Publisher

World Scientific Pub Co Pte Ltd

Subject

Computer Networks and Communications, General Medicine

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0129065723500521

References: 47 articles.

1. Document image analysis: A primer

2. Optical character recognition with neural networks and post-correction with finite state methods

3. Deep Learning for Historical Document Analysis and Recognition—A Survey

Cited by 6 articles.

1. In-domain versus out-of-domain transfer learning for document layout analysis;International Journal on Document Analysis and Recognition (IJDAR);2024-08-19

2. U-DIADS-Bib: a full and few-shot pixel-precise dataset for document layout analysis of ancient manuscripts;Neural Computing and Applications;2024-01-16

3. A One-Shot Learning Approach to Document Layout Segmentation of Ancient Arabic Manuscripts;2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV);2024-01-03

4. Is ImageNet Always the Best Option? An Overview on Transfer Learning Strategies for Document Layout Analysis;Lecture Notes in Computer Science;2024

5. ICDAR 2024 Competition on Few-Shot and Many-Shot Layout Segmentation of Ancient Manuscripts (SAM);Lecture Notes in Computer Science;2024