Offline Pashto Characters Dataset for OCR Systems

Published:2021-07-27 Issue: Volume:2021 Page:1-7
ISSN:1939-0122
Container-title:Security and Communication Networks
language:en
Short-container-title:Security and Communication Networks

Author:

Khan Sulaiman¹², Khan Habib Ullah², Nazir Shah¹

Affiliation:

1. Department of Computer Science, University of Swabi, Swabi, Pakistan

2. Department of Accounting and Information Systems, College of Business and Economics, Qatar University, Doha, Qatar

Abstract

In computer vision and artificial intelligence, image-based text recognition and analysis play a key role in text retrieval. Training a machine learning technique to recognize the handwritten characters of a specific language requires a standard dataset. Acceptable handwritten character datasets are available for many languages, including English and Arabic. However, the lack of datasets for handwritten Pashto characters hinders the application of suitable machine learning algorithms to this script. To address this issue, this study presents the first handwritten Pashto characters image dataset (HPCID) for scientific research. The dataset consists of 14,784 samples: 336 samples for each of the 44 characters in the Pashto character set. The samples were collected on A4-sized paper from students of the Pashto Department at the University of Peshawar, Khyber Pakhtunkhwa, Pakistan. In total, 336 students and faculty members contributed during the data collection phase. The dataset contains multisize, multifont, and multistyle characters of varying structures.
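The dataset size stated in the abstract follows from simple arithmetic (44 characters × 336 samples each). The sketch below is a minimal illustration of that check, not part of the published work: the integer label encoding (0 to 43) and the helper names `expected_total` and `find_unbalanced_classes` are assumptions, since the abstract does not describe how the dataset files or labels are organized.

```python
from collections import Counter

NUM_CLASSES = 44         # characters in the Pashto set, per the abstract
SAMPLES_PER_CLASS = 336  # samples per character, per the abstract


def expected_total(num_classes=NUM_CLASSES, per_class=SAMPLES_PER_CLASS):
    """Total sample count implied by a perfectly balanced dataset."""
    return num_classes * per_class


def find_unbalanced_classes(labels, num_classes=NUM_CLASSES,
                            per_class=SAMPLES_PER_CLASS):
    """Return {class_id: actual_count} for every class whose count
    differs from per_class. Labels are assumed (hypothetically) to be
    integers in the range 0..num_classes-1."""
    counts = Counter(labels)
    return {
        class_id: counts.get(class_id, 0)
        for class_id in range(num_classes)
        if counts.get(class_id, 0) != per_class
    }


if __name__ == "__main__":
    # Synthetic label list standing in for a real label file.
    labels = [c for c in range(NUM_CLASSES) for _ in range(SAMPLES_PER_CLASS)]
    print(expected_total())                 # 44 * 336
    print(len(labels) == expected_total())  # balanced synthetic labels
    print(find_unbalanced_classes(labels))  # empty dict when balanced
```

A check of this kind is useful before training, because a missing or duplicated scan would break the per-character balance the abstract describes.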

Publisher

Hindawi Limited

Subject

Computer Networks and Communications,Information Systems

Link

http://downloads.hindawi.com/journals/scn/2021/3543816.pdf

References: 23 articles.

1. Printed Arabic Text Database for Automatic Recognition Systems

2. Survey for Databases On Arabic Off-line Handwritten Characters Recognition System

3. KNN and ANN-based Recognition of Handwritten Pashto Letters using Zoning Features

4. Pioneer dataset and recognition of Handwritten Pashto characters using Convolution Neural Networks

5. Pashto Characters Recognition Using Multi-Class Enabled Support Vector Machine

Cited by 6 articles.

1. Analysis of Cursive Text Recognition Systems: A Systematic Literature Review;ACM Transactions on Asian and Low-Resource Language Information Processing;2023-07-20

2. Pashto Handwritten Invariant Character Trajectory Prediction Using a Customized Deep Learning Technique;Sensors;2023-06-30

3. Recognition of Kannada Character Scripts Using Hybrid Feature Extraction and Ensemble Learning Approaches;Cybernetics and Systems;2023-03-09

4. Tamil Handwritten Character Recognition System using Statistical Algorithmic Approaches;Computer Speech & Language;2023-03

5. PHTI: Pashto Handwritten Text Imagebase for Deep Learning Applications;IEEE Access;2022