Deterministic Routing between Layout Abstractions for Multi-Scale Classification of Visually Rich Documents

Published: 2019-08
Container-title: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence

Author:

Sarkhel Ritesh¹, Nandi Arnab¹

Affiliation:

1. Department of Computer Science and Engineering, The Ohio State University

Abstract

Classifying heterogeneous visually rich documents is a challenging task. The difficulty increases further when the maximum allowed inference turnaround time is constrained by a threshold. The increased inference overhead, weighed against the limited gain in classification capability, makes current multi-scale approaches infeasible in such scenarios. This work makes two major contributions. First, we propose a spatial pyramid model that extracts highly discriminative multi-scale feature descriptors from a visually rich document by leveraging the inherent hierarchy of its layout. Second, we propose a deterministic routing scheme that accelerates end-to-end inference by utilizing the spatial pyramid model. A depth-wise separable multi-column convolutional network is developed to enable our method. We evaluated the proposed approach on four publicly available benchmark datasets of visually rich documents. Results suggest that the proposed approach performs robustly against state-of-the-art methods in both classification accuracy and total inference turnaround.
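
The abstract names a depth-wise separable convolutional network as part of keeping inference cost down. As a rough illustration of why that kind of layer is cheaper, the sketch below compares weight counts for a standard convolution and a depthwise separable one. It is not the authors' architecture: the function names and the layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) are assumptions chosen only for the arithmetic.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, not taken from the paper.
    c_in, c_out, k = 64, 128, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = separable_conv_params(c_in, c_out, k)
    print(f"standard:  {std} weights")
    print(f"separable: {sep} weights")
    print(f"ratio:     {sep / std:.3f}")
```

For these assumed sizes the separable layer needs 8,768 weights against 73,728 for the standard one, a ratio of 1/c_out + 1/k², or about 0.119.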

Publisher

International Joint Conferences on Artificial Intelligence Organization

Cited by 9 articles.

1. DocXclassifier: towards a robust and interpretable deep neural network for document image classification;International Journal on Document Analysis and Recognition (IJDAR);2024-06-25

2. DWT-CompCNN: deep image classification network for high throughput JPEG 2000 compressed documents;Pattern Analysis and Applications;2023-08-02

3. Document Image Analysis Using Deep Multi-modular Features;SN Computer Science;2022-10-15

4. mmLayout: Multi-grained MultiModal Transformer for Document Understanding;Proceedings of the 30th ACM International Conference on Multimedia;2022-10-10

5. DiT: Self-supervised Pre-training for Document Image Transformer;Proceedings of the 30th ACM International Conference on Multimedia;2022-10-10