OTTOMAN CHARACTER RECOGNITION ON PRINTED DOCUMENTS USING DEEP LEARNING

Author:

Demir Ali Alper (1), Ozkaya Ufuk (1)

Affiliation:

1. SÜLEYMAN DEMİREL ÜNİVERSİTESİ, MÜHENDİSLİK FAKÜLTESİ

Abstract

This study develops a deep learning-based method for detecting and recognizing characters in printed Ottoman documents. Character detection and recognition is treated as an object detection problem, and an Ottoman character recognition model is built on YOLO, one of the most successful object detection methods. A dataset of Ottoman document images was also created, in which every character in each image is annotated. Data augmentation techniques were applied to improve recognition accuracy and the robustness of the method. The Ottoman character recognition network was trained on this dataset and evaluated on its test images. Performance was measured with the average precision metric, which is widely used in the literature. Average precision was calculated for the 34 character classes in the dataset, and the results were interpreted in terms of the method's strengths and weaknesses. The results show that the proposed method detects and recognizes characters in printed Ottoman documents with high accuracy, reaching a weighted average precision of 98.71%.
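
The evaluation described in the abstract (per-class average precision combined into a weighted average) can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes that each detection has already been matched against the ground truth and flagged as a true or false positive, that AP is the all-point interpolated area under the precision-recall curve, and that classes are weighted by their number of ground-truth instances; the abstract does not state any of these details. The function names and the class labels in the usage note are hypothetical.

```python
import numpy as np


def average_precision(scores, is_tp, n_gt):
    """All-point interpolated AP for one class.

    scores: confidence of each detection of this class
    is_tp:  1 if the detection matched a ground-truth box, else 0
            (the matching step itself is assumed to be done already)
    n_gt:   number of ground-truth instances of this class
    """
    if n_gt == 0:
        return 0.0
    # Rank detections from most to least confident.
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    tp = np.asarray(is_tp, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    recall = cum_tp / n_gt
    precision = cum_tp / (cum_tp + cum_fp)
    # Pad the curve so it spans recall 0 to 1.
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    # Replace precision by its running maximum from the right
    # (monotone non-increasing envelope).
    p = np.maximum.accumulate(p[::-1])[::-1]
    # Sum rectangle areas wherever recall increases.
    idx = np.nonzero(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))


def weighted_average_precision(ap_by_class, n_gt_by_class):
    """Mean of per-class AP weighted by ground-truth count (an assumed weighting)."""
    total = sum(n_gt_by_class[c] for c in ap_by_class)
    return sum(ap_by_class[c] * n_gt_by_class[c] for c in ap_by_class) / total
```

For example, with two hypothetical classes, `weighted_average_precision({"elif": 1.0, "be": 0.5}, {"elif": 3, "be": 1})` lets the more frequent class dominate the result, which is the usual motivation for weighting over a plain mean when class frequencies differ.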

Publisher

Muhendislik Bilimleri ve Tasarim Dergisi

