Author:
Pino Rodney, Mendoza Renier, Sambayan Rachelle
Abstract
Baybayin is a Tagalog-language writing system primarily used in the northern Philippines during the pre-Hispanic period. In 2018, the House of Representatives approved House Bill 1022, the “National Writing System Act,” which declares the Baybayin script the Philippines’ national writing system. Thus, documents, signage, books, and similar materials may soon carry Baybayin text. However, the Latin alphabet is still the primary script used in the country, so Latin and Baybayin scripts may appear in the same text. In this paper, we present an optical character recognition (OCR) system that distinguishes Baybayin script from Latin script in a text image. Preprocessing converts the input image to binary data and computes the bounding box of each word in the text, using a modified k-means algorithm and the MATLAB ocr function, respectively. Classification then isolates each word and further segments each character’s components. With the aid of a support vector machine (SVM) character classifier, we assign each word the script, Baybayin or Latin, into which most of its characters are classified. To the best of our knowledge, this is the first system that discriminates, at the block level, the Baybayin script from Latin. The proposed algorithm yields a 93.64% recognition accuracy on a novel dataset. The accompanying code and the dataset are made publicly available so that the results of the study are reproducible.
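The word-level decision described in the abstract, where a word takes the script assigned to most of its characters, can be illustrated with a minimal Python sketch. This is not the authors' released code. The function name `classify_word`, the label strings, and the tie-breaking rule are assumptions made for illustration, and the per-character labels are taken as given rather than produced by an actual SVM.

```python
from collections import Counter


def classify_word(char_labels, tie_label="Latin"):
    """Return the script that received the most character votes.

    char_labels: list of per-character labels, each "Baybayin" or "Latin",
    as a character classifier might produce them (hypothetical input).
    tie_label: script returned when both scripts get the same number of
    votes. The abstract does not say how ties are handled, so this
    default is an assumption.
    """
    if not char_labels:
        raise ValueError("word has no classified characters")

    counts = Counter(char_labels)
    baybayin_votes = counts.get("Baybayin", 0)
    latin_votes = counts.get("Latin", 0)

    if baybayin_votes > latin_votes:
        return "Baybayin"
    if latin_votes > baybayin_votes:
        return "Latin"
    return tie_label


# Example: three of four characters were labelled Baybayin.
word_script = classify_word(["Baybayin", "Baybayin", "Latin", "Baybayin"])
```

Voting over characters means a single misclassified character does not change the label of a longer word, which is one plausible reason to decide the script per word rather than per character.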
Publisher
Science and Technology Information Institute
Cited by
2 articles.
1. Aralin Baybayin: Intelligent Tutoring System for Baybayin Scripts with Character Recognition;Proceedings of the 2023 11th International Conference on Computer and Communications Management;2023-08-04
2. Optical Character Recognition of Baybayin Writing System using YOLOv3 Algorithm;2022 IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET);2022-09-13