Author:
Sainin Mohd Shamrie, Sintian Minah, Alias Suraya, Tahir Asni
Abstract
Machine translation (MT) is computer-aided language translation in which a computer (machine) translates text from one natural language to another. Many online language translation tools exist, but thus far none translates sequences of text for the under-resourced Kadazandusun language. Although web-based and mobile Kadazandusun dictionary applications are available, these systems do not translate more than one word at a time. Hence, this paper discusses a preliminary Malay-to-Kadazandusun translation system. Basic word-to-word translation with dictionary alignment, based on Direct Machine Translation (DMT), is selected to begin the exploration of the translation domain; DMT is one of the earliest translation methods and relies on the word-to-word approach (sequence-to-sequence model). The paper investigates the under-resourced language and the task of translating from the Malay language to the Kadazandusun language or vice versa. It presents the application, the process, and the results of the system according to the basic Kadazandusun word arrangement (Verb-Subject-Object), and evaluates translation quality using the Bilingual Evaluation Understudy (BLEU) score. The process involves several phases: data collection (word pair translation), preprocessing, text selection, translation procedures, and performance evaluation. The preliminary translation approach is shown to achieve BLEU scores of up to 0.5, which indicates that the translation is readable but requires post-editing for better comprehension. The findings are significant for the quality of under-resourced language translation and serve as a starting point for other machine translation methodologies, such as statistical or deep learning-based translation.
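The pipeline summarized in the abstract (dictionary lookup, reordering to Verb-Subject-Object, BLEU scoring) can be illustrated with a minimal sketch. This is not the authors' implementation. The lexicon is hypothetical and its target-side tokens are placeholders, not real Kadazandusun words; the function names, the three-token reordering rule, and the unsmoothed single-reference BLEU are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy lexicon: the target-side tokens are placeholders,
# NOT real Kadazandusun words.
LEXICON = {
    "saya": "KDZ_I",
    "makan": "KDZ_EAT",
    "nasi": "KDZ_RICE",
}


def translate_direct(sentence, lexicon):
    """Word-to-word lookup; unknown words are passed through unchanged."""
    return [lexicon.get(word, word) for word in sentence.lower().split()]


def svo_to_vso(tokens):
    """Naive reordering that assumes a three-token Subject-Verb-Object input."""
    if len(tokens) != 3:
        return tokens  # leave anything else untouched
    subject, verb, obj = tokens
    return [verb, subject, obj]


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against a single reference."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one empty precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(log_precisions) / max_n)


word_for_word = translate_direct("Saya makan nasi", LEXICON)
reordered = svo_to_vso(word_for_word)
reference = ["KDZ_EAT", "KDZ_I", "KDZ_RICE"]  # placeholder VSO reference
score_plain = bleu(word_for_word, reference, max_n=2)
score_reordered = bleu(reordered, reference, max_n=2)
```

In this toy case the word-for-word output shares every unigram with the reference but no bigram, so its unsmoothed score is zero until the tokens are reordered. Real evaluations typically use longer sentences, n-grams up to length four, and often smoothing.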
Publisher
International Association for Educators and Researchers (IAER)
Subject
Electrical and Electronic Engineering, General Computer Science
Cited by
1 article.