Abstract
In this paper we introduce a new dataset of Pashtu numerals consisting of scanned handwritten images. We make the dataset publicly available for scientific and research use. Pashtu is used by more than fifty million people for both spoken and written communication, yet no effort has so far been devoted to Optical Character Recognition (OCR) systems for the language. We propose a deep learning based method for recognizing handwritten Pashtu numerals. We use convolutional neural networks (CNNs) for both feature extraction and classification. The proposed CNN-based model achieves a recognition accuracy of 91.45%.
Publisher
Balochistan University of Information Technology, Engineering and Management Sciences