Cascade Network with Deformable Composite Backbone for Formula Detection in Scanned Document Images

Published:2021-08-19 Issue:16 Volume:11 Page:7610
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Hashmi Khurram Azeem, Pagani Alain, Liwicki Marcus, Stricker Didier, Afzal Muhammad Zeshan

Abstract

This paper presents a novel architecture for detecting mathematical formulas in document images, an important step for reliable information extraction in several domains. Cascade Mask R-CNN networks were recently introduced for object detection in computer vision. We propose two modifications to the existing Cascade Mask R-CNN architecture. First, the proposed network uses deformable convolutions instead of conventional convolutions in the backbone network to better localize areas of interest. Second, it uses a dual ResNeXt-101 backbone with composite connections at the parallel stages. The resulting network is end-to-end trainable. We evaluate the proposed approach on the ICDAR-2017 POD and Marmot datasets. It achieves state-of-the-art performance on ICDAR-2017 POD at a higher IoU threshold, with an F1-score of 0.917, reducing the relative error by 7.8%. Moreover, it achieves a correct detection accuracy of 81.3% on embedded formulas in the Marmot dataset, a relative error reduction of 30%.
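The abstract reports results in terms of an IoU threshold, an F1-score, and a relative error reduction. The following is a minimal Python sketch of how these quantities are commonly computed. It assumes axis-aligned boxes given as (x1, y1, x2, y2) and assumes that error means one minus the score; the function names and the numbers in the usage lines are hypothetical illustrations, not the authors' evaluation code or their reported baselines.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region; zero if the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_error_reduction(baseline_score, new_score):
    """Fraction of the baseline error removed, assuming error = 1 - score."""
    baseline_error = 1.0 - baseline_score
    new_error = 1.0 - new_score
    return (baseline_error - new_error) / baseline_error


# Hypothetical inputs, chosen only to illustrate the arithmetic.
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
is_match = overlap >= 0.3  # a detection counts as correct above a chosen threshold
reduction = relative_error_reduction(0.90, 0.92)
```

Under this definition, raising a score from 0.90 to 0.92 lowers the error from 0.10 to 0.08, a relative error reduction of about 20%. A stricter IoU threshold requires tighter box overlap before a detection counts as correct, which lowers precision and recall and therefore the F1-score.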

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/11/16/7610/pdf

Reference: 55 articles.

1. Optical recognition of printed mathematical documents;Inoue;Proc. Third Asian Technol. Conf. Math,1998

Cited by 14 articles.

1. Mathematical formula detection in document images: A new dataset and a new approach;Pattern Recognition;2024-04

2. Towards End-to-End Semi-supervised Table Detection with Semantic Aligned Matching Transformer;Lecture Notes in Computer Science;2024

3. A Hybrid Approach for Document Layout Analysis in Document Images;Lecture Notes in Computer Science;2024

4. End to End Table Transformer;Lecture Notes in Computer Science;2024

5. UnSupDLA: Towards Unsupervised Document Layout Analysis;Lecture Notes in Computer Science;2024