Affiliation:
1. University of Sfax, Faculty of Economics and Management of Sfax, Sfax, Tunisia
2. Computer Science, Faculty of Science Sfax, Sfax, Tunisia
Abstract
Translating a mother tongue such as the Tunisian dialect into Modern Standard Arabic is an important task in natural language processing because of its wide range of applications. Researchers have recently shown increased interest in the Tunisian dialect, driven primarily by the massive volume of content that Tunisians have generated spontaneously on social media since the revolution. This paper presents two translators that convert the Tunisian dialect into Modern Standard Arabic. The first is rule-based: it uses a collection of finite-state transducers and a bilingual dictionary derived from the study corpus. The second relies on deep learning, specifically a sequence-to-sequence transformer model and a parallel corpus. To evaluate and compare the two translators, we tested them on a parallel corpus of 8,599 words. The translator based on finite-state transducers achieved a BLEU score of 56.65, while the transformer-based translator achieved a higher score of 66.07.
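For readers unfamiliar with the metric reported above, the following is a minimal sketch of corpus-level BLEU on a 0 to 100 scale. It assumes one reference per hypothesis, pre-tokenized input, n-grams up to length 4, and no smoothing; the function names `ngrams` and `corpus_bleu` and the toy sentences are illustrative and are not taken from the paper, whose exact evaluation setup is not stated here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100), one reference per hypothesis, no smoothing."""
    matches = [0] * max_n  # clipped n-gram matches, per order
    totals = [0] * max_n   # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Counter intersection clips each n-gram at its reference count.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any order with zero matches gives a score of 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


# Toy example with placeholder English tokens (not data from the paper).
hypothesis = "the cat sat on the mat today".split()
reference = "the cat sat on the mat".split()
print(round(corpus_bleu([hypothesis], [reference]), 2))
```

Published scores are usually computed with an established tool rather than a hand-written function, because tokenization and smoothing choices change the number; this sketch only illustrates what the metric measures.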
Publisher
Association for Computing Machinery (ACM)