Abstract
AbstractTable detection is an essential step in many document analysis systems. Tabular data are a pivotal form of information representation that can organize data in a conventional structure for comfortable and quick information retrieval and comparison. Detection of table structures in PDF files or images is a challenging task because of the variability of table layouts, and sometimes the tabular structures’ similarities with non-tabular elements like charts, plots, etc. In this work, we have presented a table detection method using a geometric analysis of the table cell cores that represents the table cell texts. The proposed method works by analyzing the text gap information, and hence it can detect the table cell cores, irrespective of the presence of the table boundary lines and cell-separating rule-lines. Experimentations have been done on various document images of complex structures from well-known datasets. The detection accuracies obtained by us corroborate the usefulness of the proposed method.
Funder
FHNW University of Applied Sciences and Arts Northwestern Switzerland
Publisher
Springer Science and Business Media LLC
Cited by
4 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献