Adaptive Auto-encoder for Extraction of Arabic Text: invariant, font, and segment

Author:

Zerdoumi Saber1, Jhanjhi Noor Zaman2, Ahmed Riyaz3, Hashem Ibrahim Abaker Targio4, Gabralla Lubna Abdelkareim5

Affiliation:

1. Universite Constantine 2

2. Taylor's University - Lakeside Campus: Taylor's University

3. Taylor's University

4. University of Sharjah

5. Kingdom of Saudi Arabia

Abstract

This article discusses an adaptive auto-encoder research strategy for categorizing Arabic text into three components: invariant, font, and signature. We began our investigation by studying pattern recognition methods. Using the collected data, a mathematical model for Arabic pattern recognition was created. Once the model had been created, it was used to derive the algorithm. Segmentation of composite ligatures and open/closed characters was used to develop and test the algorithm's primary engine. The algorithm was tested to see whether it could distinguish between text and other objects. The evaluation method, which is based on a widely used benchmark dataset and a variety of other data sources, is also described. The most critical feature of word-level archiving is the ability to recognize each word as a separate unit and component, with a consistency that allows every pixel to be identified and its value to be adjusted. Figure 1 illustrates the detection of Arabic words in advertisements, as well as the subsequent determination of words after training and matching the algorithm. Using a vertical projection and baseline determination, or automatic correction for each issue, dots may be present or missing on the upper baseline or the lower pass line, relative to the respective center (THAA) generated from pre-characteristic learning of Arabic writing. These dots are placed on the basis of their educational and descriptive value. BAA and YAA, for example, both contain dots above the baseline. If the sequence does not contain continuous curves from top to bottom, this value is not considered. It will be decided as JIM. P1 to P10 treat the earlier zoning of cropping from right to left as an issue. Consequences of the alphabet's display: before we could even consider the exploratory form, our investigation had to address an important issue raised by the Arabic script.
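The vertical projection and baseline determination mentioned above can be illustrated with a minimal sketch. It is not the algorithm from the article: it assumes a binarized image stored as a list of rows (1 for ink, 0 for background), uses the common heuristic that the baseline is the row with the most ink, and all function names (`vertical_projection`, `estimate_baseline`, `split_on_gaps`) are hypothetical.

```python
from typing import List, Tuple

# Assumed representation: a binary image as a list of rows, 1 = ink, 0 = background.
Image = List[List[int]]


def vertical_projection(img: Image) -> List[int]:
    """Count the ink pixels in each column."""
    if not img:
        return []
    return [sum(column) for column in zip(*img)]


def horizontal_projection(img: Image) -> List[int]:
    """Count the ink pixels in each row."""
    return [sum(row) for row in img]


def estimate_baseline(img: Image) -> int:
    """Return the index of the row with the most ink.

    This is a heuristic assumption: in connected Arabic script the
    baseline usually carries the densest horizontal stroke.
    """
    profile = horizontal_projection(img)
    if not profile:
        raise ValueError("empty image")
    return max(range(len(profile)), key=profile.__getitem__)


def split_on_gaps(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, of runs with ink."""
    segments: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a run of inked columns begins
        elif count == 0 and start is not None:
            segments.append((start, x))    # an empty column closes the run
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


if __name__ == "__main__":
    sample = [
        [0, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 1, 0],
        [1, 1, 1, 0, 1, 1, 1],
        [0, 0, 0, 0, 0, 0, 0],
    ]
    columns = vertical_projection(sample)
    print(columns, estimate_baseline(sample), split_on_gaps(columns))
```

Splitting on empty columns only separates components that do not touch; ligatures and connected characters would need the finer segmentation the abstract refers to.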
Because the end points of P1 are modified, they must be identified. P2 consists of connected novel elements, which must be altered in some way so that they can be identified as errors in P2. Adjustments are necessary, such as decreasing P3, increasing the distance from P4, decreasing weight, and changing the value of white to black, as well as white to white. P5 contains a wealth of useful Arabic-language content. Cropping results in an overlapping zoning pattern; this issue is addressed in section P6. Using slope equations, all of the pixels in the image are connected to form a single image (1). If the skew is not zero and the baseline depends on the skew, the focus is adjusted by reversing the absolute value of the skew in the direction of contact. The diagram depicts the entire alphabet extraction process.
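As a rough illustration of how a slope equation can drive skew correction, the sketch below fits a least-squares line through a set of baseline pixel coordinates, converts the slope to an angle, and rotates the points by the opposite angle. The abstract does not give its equation (1), so the least-squares fit, the rotation about the centroid, and the names `fit_slope`, `skew_angle_degrees`, and `deskew_points` are all assumptions made for this example.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) pixel coordinates


def fit_slope(points: List[Point]) -> float:
    """Least-squares slope of y on x for the given points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("points are vertically aligned; slope undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def skew_angle_degrees(points: List[Point]) -> float:
    """Angle of the fitted line relative to the x axis, in degrees."""
    return math.degrees(math.atan(fit_slope(points)))


def deskew_points(points: List[Point], angle_deg: float) -> List[Point]:
    """Rotate the points by -angle_deg about their centroid.

    Reversing the measured skew brings the fitted line back to horizontal.
    The sign convention follows the coordinates passed in; with image rows
    growing downward the visual direction of rotation is mirrored.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    theta = math.radians(-angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [
        (cx + (x - cx) * cos_t - (y - cy) * sin_t,
         cy + (x - cx) * sin_t + (y - cy) * cos_t)
        for x, y in points
    ]


if __name__ == "__main__":
    baseline_pixels = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
    angle = skew_angle_degrees(baseline_pixels)
    corrected = deskew_points(baseline_pixels, angle)
    print(angle, fit_slope(corrected))
```

When the measured skew is zero the rotation is the identity, which matches the condition in the text that correction applies only to a non-zero skew.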

Publisher

Research Square Platform LLC
