Classification of Text and Non-Text from Bilingual Real-Time Document Using Deep Learning Approach

Author:

G Shivakumar 1, M Ravikumar 1, J Shivaprasad B 2

Affiliation:

1. Kuvempu University

2. Srinivasa Institute of Technology

Abstract

In this work, we present an efficient approach for classifying text and non-text information in real-time bilingual office document images using a deep learning approach, namely the U-Net architecture. For experimentation, we created our own dataset containing 2000 document images. Pre-processing is first applied to the input document images. The proposed method is compared with other existing methods and obtains an accuracy of 99.62%. Several performance measures (specificity, sensitivity, precision, and F1-score) are used in the experimentation.
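The performance measures named in the abstract can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch, assuming that text is treated as the positive class and that predictions and ground truth are flattened 0/1 pixel labels. The helper names (`confusion_counts`, `metrics`) and the toy labels are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class Confusion:
    tp: int  # text predicted as text
    fp: int  # non-text predicted as text
    tn: int  # non-text predicted as non-text
    fn: int  # text predicted as non-text


def confusion_counts(pred: Sequence[int], truth: Sequence[int]) -> Confusion:
    """Count outcomes for flat 0/1 label sequences (1 = text, an assumption)."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    tp = fp = tn = fn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 0:
            tn += 1
        else:
            fn += 1
    return Confusion(tp, fp, tn, fn)


def _ratio(num: float, den: float) -> float:
    # Return 0.0 instead of dividing by zero when a class is absent.
    return num / den if den else 0.0


def metrics(c: Confusion) -> Dict[str, float]:
    """Standard definitions of the measures listed in the abstract."""
    precision = _ratio(c.tp, c.tp + c.fp)
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # also called recall
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "sensitivity": sensitivity,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "precision": precision,
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not results from the paper.
    pred = [1, 1, 0, 0, 1, 0, 1, 0]
    truth = [1, 0, 0, 0, 1, 1, 1, 0]
    for name, value in metrics(confusion_counts(pred, truth)).items():
        print(f"{name}: {value:.4f}")
```

For a segmentation model such as U-Net, the predicted mask and the ground-truth mask would be flattened before being passed to `confusion_counts`. Whether the paper computes its measures per pixel or per region is not stated in the abstract.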

Publisher

Research Square Platform LLC
