Abstract
The growing amount of data demands methods that can gradually learn from new samples. However, continually training a network is not trivial: retraining it on new data usually results in a phenomenon called "catastrophic forgetting". In short, the model's performance on previously seen data drops as it learns from new instances. This paper explores this issue in the context of table detection. While there are multiple datasets and sophisticated methods for table detection, the use of continual learning techniques in this domain has not been studied. We employed an effective technique called experience replay and performed extensive experiments on several datasets to investigate the effects of catastrophic forgetting. The results show that our proposed approach mitigates the performance drop by 15 percent. To the best of our knowledge, this is the first time that continual learning techniques have been adopted for table detection, and we hope this stands as a baseline for future research.
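For readers unfamiliar with the technique, the sketch below illustrates the general idea of experience replay in a detection setting: a small buffer retains samples from earlier datasets (here via reservoir sampling) and a few of them are mixed into every new training batch, so the detector keeps seeing old data while learning the new task. This is a minimal sketch, not the paper's implementation; the `ReplayBuffer` class, the buffer capacity, the `replay_k` ratio, and the torchvision-style detector interface (a model that takes lists of images and target dicts and returns a loss dict in training mode) are all assumptions.

```python
import random

class ReplayBuffer:
    """Fixed-size store of (image, target) samples from earlier tasks."""
    def __init__(self, capacity=500):  # capacity is an illustrative choice
        self.capacity = capacity
        self.samples = []
        self.seen = 0

    def add(self, sample):
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(sample)
        else:
            # Reservoir sampling: every sample seen so far has an equal
            # chance of remaining in the buffer.
            idx = random.randrange(self.seen)
            if idx < self.capacity:
                self.samples[idx] = sample

    def draw(self, k):
        return random.sample(self.samples, min(k, len(self.samples)))

def train_step(detector, optimizer, new_batch, buffer, replay_k=8):
    """One update: mix replayed old samples into the current batch."""
    batch = list(new_batch) + buffer.draw(replay_k)
    images = [img for img, _ in batch]
    targets = [tgt for _, tgt in batch]
    optimizer.zero_grad()
    # Assumes a torchvision-style detector that returns a dict of
    # losses when given targets in training mode.
    loss = sum(detector(images, targets).values())
    loss.backward()
    optimizer.step()
    for sample in new_batch:
        buffer.add(sample)
    return loss.item()
```

In practice, the buffer capacity and replay ratio trade off memory cost against how strongly old tasks are preserved; larger buffers reduce forgetting but increase storage and per-step compute.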