Adaptive Auto-encoder for Extraction of Arabic Text: invariant, font, and segment

Published: 2022-11-11

Author:

Zerdoumi Saber¹, Jhanjhi Noor Zaman², Ahmed Riyaz³, Hashem Ibrahim Abaker Targio⁴, Gabralla Lubna Abdelkareim⁵

Affiliation:

1. Universite Constantine 2

2. Taylor's University - Lakeside Campus: Taylor's University

3. Taylor's University

4. University of Sharjah

5. Kingdom of Saudi Arabia

Abstract

This article discusses an adaptive auto-encoder research strategy for categorizing Arabic text into three components: invariant, font, and signature. We began our investigation by studying pattern recognition methods. Using the collected data, a mathematical model for Arabic pattern recognition was created. Once the model had been created, it was used to generate the algorithm. Segmentation of area composite ligatures and open/closed characters was used to develop and test the algorithm's primary engine. The algorithm was tested to see if it could distinguish between text and other objects. The evaluation method, which is based on a widely used benchmark dataset and a variety of other data sources, is also described. The most critical feature of word-level archiving is the ability to recognize each word as a separate unit and component, with a consistency that allows every pixel to be identified and its value to be adjusted. Figure 1 illustrates the detection of Arabic words in advertisements, as well as the subsequent determination of words after training and matching the algorithm. Each issue is handled using a vertical projection and a baseline determination or automatic correction. On the upper baseline or lower pass line, with the respective center THAA generated using pre-characteristics learning for Arabic writing, dots may be present or missing. These dots are placed on the basis of educational and descriptive value. BAA and YAA, for example, both contain dots above the baseline. If the sequence does not contain continuous curves from top to bottom, this value will not be considered; as with JIM, it will be decided. P1 to P10 consider the earlier zoning of cropping from right to left as an issue. Consequences of the alphabets' display: before we could even consider the exploratory form, our investigation was driven by an important issue raised by the Arabic script.
Because they are modified, the end points of P1 must be identified. P2 consists of novel elements that are connected; these novel components must be altered in some way in order to identify them as errors in P2. Adjustments are necessary, such as decreasing P3, increasing the distance from P4, decreasing weight, and changing pixel values from white to black as well as white to white. P5 contains a wealth of useful Arabic-language content. An overlapping zoning pattern results from cropping; this issue is addressed in the P6 section. Using the slope equation (1), all of the pixels in the image are connected to form a single image. If the skew is not zero and the baseline depends on the skew, the focus is adjusted by reversing the absolute value of the skew in the direction of contact. The diagram depicts the entire alphabet extraction process.
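The abstract mentions vertical projection, baseline determination, and a slope-based skew value without giving details. The following is a minimal sketch of those generic techniques, not the authors' algorithm. It assumes a binarized image stored as a NumPy array with ink pixels equal to 1; the function names, the "row with the most ink" baseline rule, and the least-squares slope fit are illustrative assumptions that do not come from the paper.

```python
import numpy as np


def projection_profiles(binary):
    """Return (vertical, horizontal) ink counts for a 0/1 image (ink = 1)."""
    binary = np.asarray(binary)
    vertical = binary.sum(axis=0)    # ink pixels in each column
    horizontal = binary.sum(axis=1)  # ink pixels in each row
    return vertical, horizontal


def estimate_baseline(binary):
    """Assumed rule: the baseline is the row holding the most ink."""
    _, horizontal = projection_profiles(binary)
    return int(np.argmax(horizontal))


def split_on_gaps(binary):
    """Split the image into column ranges [start, end) separated by empty columns."""
    vertical, _ = projection_profiles(binary)
    inked = vertical > 0
    segments = []
    start = None
    for x, has_ink in enumerate(inked):
        if has_ink and start is None:
            start = x                      # a new segment begins
        elif not has_ink and start is not None:
            segments.append((start, x))    # segment ends at the first empty column
            start = None
    if start is not None:
        segments.append((start, len(inked)))
    return segments


def skew_slope(binary):
    """Least-squares slope of ink rows against ink columns (assumed skew measure)."""
    rows, cols = np.nonzero(binary)
    slope, _intercept = np.polyfit(cols, rows, 1)
    return float(slope)


def correction_angle_degrees(binary):
    """Angle that would cancel the estimated skew: the negated skew angle."""
    return -float(np.degrees(np.arctan(skew_slope(binary))))
```

In this sketch a non-zero slope signals skew, and the correction is simply the estimated angle with its sign reversed. Rotating the image by that angle, and handling dots that sit above or below the baseline, are left out.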

Publisher

Research Square Platform LLC

References (88 articles)

1. Srihari, S. N., Shekhawat, A., & Lam, S. W. (2003). Optical character recognition (OCR).

2. Du, J. (2013). A discriminative linear regression approach to adaptation of multi-prototype based classifiers and its applications for Chinese OCR. Pattern Recognition.

3. Patil, V. (2011). Handwritten English character recognition using neural network. Elixir Comput Sci Eng.

4. Bag, S. (2014). Recognition of Bangla compound characters using structural decomposition. Pattern Recognition.

5. Saber, Z., et al. Efficient Approach to Segment Ligatures and Open Characters in Offline Arabic text.