Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers

Published: 2021-01-19
Issue: HistoInformatics
Volume: HistoInformatics
ISSN: 2416-5999
Container-title: Journal of Data Mining & Digital Humanities
Language: en

Author:

Barman Raphaël, Ehrmann Maud, Clematide Simon, Oliveira Sofia Ares, Kaplan Frédéric

Abstract

The massive amounts of digitized historical documents acquired over the last decades naturally lend themselves to automatic processing and exploration. Research efforts that seek to automatically process facsimiles and extract information from them are multiplying, with document layout analysis as an essential first step. Although the identification and categorization of segments of interest in document images have progressed significantly in recent years thanks to deep learning techniques, many challenges remain, including the use of finer-grained segmentation typologies and the handling of complex, heterogeneous documents such as historical newspapers. Moreover, most approaches consider visual features only and ignore the textual signal. In this context, we introduce a multimodal approach for the semantic segmentation of historical newspapers that combines visual and textual features. Based on a series of experiments on diachronic Swiss and Luxembourgish newspapers, we investigate, among other things, the predictive power of visual and textual features and their capacity to generalize across time and sources. Results show that multimodal models consistently improve on a strong visual baseline and are more robust to high material variance.

Publisher

Centre pour la Communication Scientifique Directe (CCSD)

Cited by 15 articles.

1. M2SH: A Hybrid Approach to Table Structure Recognition using Two-Stage Multi-Modality Feature Fusion;2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC);2023-10-01

2. Review of Semi-Structured Document Information Extraction Techniques Based on Deep Learning;2023 2nd International Conference on Machine Learning, Cloud Computing and Intelligent Mining (MLCCIM);2023-07-25

3. A Study of COVID-19 and Its Detection Methods Using Imaging Techniques;Futuristic Communication and Network Technologies;2023

4. A new fusion of whale optimizer algorithm with Kapur’s entropy for multi-threshold image segmentation: analysis and validations;Artificial Intelligence Review;2022-03-21

5. Layout Aware Semantic Element Extraction for Sustainable Science & Technology Decision Support;Sustainability;2022-02-28