Improving information extraction from visually rich documents using visual span representations

Published:2021-01 Issue:5 Volume:14 Page:822-834
ISSN:2150-8097
Container-title:Proceedings of the VLDB Endowment
language:en
Short-container-title:Proc. VLDB Endow.

Author:

Sarkhel Ritesh¹, Nandi Arnab²

Affiliation:

1. The Ohio State University

2. The Ohio State University

Abstract

Along with textual content, visual features play an essential role in the semantics of visually rich documents. Information extraction (IE) tasks perform poorly on these documents if these visual cues are not taken into account. In this paper, we present Artemis - a visually aware, machine-learning-based IE method for heterogeneous visually rich documents. Artemis represents a visual span in a document by jointly encoding its visual and textual context for IE tasks. Our main contribution is two-fold. First, we develop a deep-learning model that identifies the local context boundary of a visual span with minimal human-labeling. Second, we describe a deep neural network that encodes the multimodal context of a visual span into a fixed-length vector by taking its textual and layout-specific features into account. It identifies the visual span(s) containing a named entity by leveraging this learned representation followed by an inference task. We evaluate Artemis on four heterogeneous datasets from different domains over a suite of information extraction tasks. Results show that it outperforms state-of-the-art text-based methods by up to 17 points in F1-score.

Publisher

VLDB Endowment

Subject

General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development

Link

https://dl.acm.org/doi/pdf/10.14778/3446095.3446104

Cited by 4 articles.

1. Self-Training for Label-Efficient Information Extraction from Semi-Structured Web-Pages;Proceedings of the VLDB Endowment;2023-07

2. Automatic Key Information Extraction from Visually Rich Documents;2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA);2022-12

3. Scalable and Cost-effective Serverless Architecture for Information Extraction Workflows;Proceedings of the 2nd Workshop on High Performance Serverless Computing;2022-06-27

4. CORE-SG: Efficient Computation of Multiple MSTs for Density-Based Methods;2022 IEEE 38th International Conference on Data Engineering (ICDE);2022-05