Recent advances in Apertium, a free/open-source rule-based machine translation platform for low-resource languages

Published: 2021-10-18 Issue: 4 Volume: 35 Pages: 475-502
ISSN: 0922-6567
Container-title: Machine Translation
Language: en
Short-container-title: Machine Translation

Author:

Khanna Tanmai, Washington Jonathan N., Tyers Francis M., Bayatlı Sevilay, Swanson Daniel G., Pirinen Tommi A., Tang Irene, Alòs i Font Hèctor

Abstract

This paper presents an overview of Apertium, a free and open-source rule-based machine translation platform. Translation in Apertium happens through a pipeline of modular tools, and the platform continues to be improved as more language pairs are added. Several advances have been implemented since the last publication, including some new optional modules: a module that allows rules to process recursive structures at the structural transfer stage, a module that deals with contiguous and discontiguous multi-word expressions, and a module that resolves anaphora to aid translation. Also highlighted is the hybridisation of Apertium through statistical modules that augment the pipeline, and statistical methods that augment existing modules. This includes morphological disambiguation, weighted structural transfer, and lexical selection modules that learn from limited data. The paper also discusses how a platform like Apertium can be a critical part of access to language technology for so-called low-resource languages, which might be ignored or deemed unapproachable by popular corpus-based translation technologies. Finally, the paper presents some of the released and unreleased language pairs, concluding with a brief look at some supplementary Apertium tools that prove valuable to users as well as language developers. All Apertium-related code, including language data, is free/open-source and available at https://github.com/apertium.

Funder

Google

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence, Linguistics and Language, Language and Linguistics, Software

Link

https://link.springer.com/content/pdf/10.1007/s10590-021-09260-6.pdf

References: 55 articles.

1. Antonsen L, Trosterud T, Tyers FM (2017) A North Saami to South Saami machine translation prototype. Northern European Journal of Language Technology 4:11–27. https://doi.org/10.3384/nejlt.2000-1533.1642

2. Baldwin B (1997) CogNIAC: high precision coreference with limited knowledge and linguistic resources. In: Proceedings of a Workshop on Operational Factors in Practical, Robust Anaphora Resolution for Unrestricted Texts, Association for Computational Linguistics, pp 38–45. https://doi.org/10.3115/1598819.1598825, http://portal.acm.org/citation.cfm?doid=1598819.1598825

3. Bayatli S, Karanfil G, Gökırmak M, Tyers FM (2018a) Finite-state morphological analysis for Gagauz. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), European Language Resources Association (ELRA), Miyazaki, Japan, https://www.aclweb.org/anthology/L18-1411

4. Bayatli S, Kurnaz S, Salimzianov I, Washington JN, Tyers FM (2018b) Rule-based machine translation from Kazakh to Turkish. In: European Association for Machine Translation (EAMT), pp 49–58

5. Bayatli S, Kurnaz S, Ali A, Washington JN, Tyers FM (2020) Unsupervised weighting of transfer rules in rule-based machine translation using maximum-entropy approach. J Inf Sci Eng 36(2):309–322. https://doi.org/10.6688/JISE.202003_36(2).0010

Cited by 16 articles.

1. Exploration on Advanced Intelligent Algorithms of Artificial Intelligence for Verb Recognition in Machine Translation; ACM Transactions on Asian and Low-Resource Language Information Processing; 2024-08-08

2. Preserving Sasak Dialectal Features in English to Sasak Machine Translation through Locked Tokenization with Transformer Models; 2024 International Seminar on Intelligent Technology and Its Applications (ISITIA); 2024-07-10

3. Optimization of data analysis models for low‐resource Eurasian languages using machine translation; Internet Technology Letters; 2024-04-18

4. Real-Time Speech to Sign Language Translation Using Machine and Deep Learning; 2024 11th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO); 2024-03-14

5. Online English Machine Translation Algorithm Based on Large Language Model; 2024 3rd International Conference on Sentiment Analysis and Deep Learning (ICSADL); 2024-03-13