An in-depth analysis of the individual impact of controlled language rules on machine translation output: a mixed-methods approach

Author:

Marzouk Shaimaa

Abstract

Examining the general impact of Controlled Language (CL) rules in the context of Machine Translation (MT) has been an area of research for many years. The present study focuses on the following question: how do CL rules impact MT output individually? By analysing a German corpus-based test suite of technical texts that have been translated into English by different MT systems, this study endeavours to answer this question at different levels: the general impact of CL rules (rule- and system-independent), their impact at rule level (system-independent) as well as at rule and system level. The results of five MT systems are analysed and contrasted: a rule-based system, a statistical system, two differently constructed hybrid systems, and a neural system. For this, a mixed-methods triangulation approach that includes error annotation, human evaluation, and automatic evaluation was applied. The data was analysed both qualitatively and quantitatively in terms of CL influence on the following parameters: number and type of MT errors, style and content quality, and scores of two automatic evaluation metrics. In line with many studies, the results show a general positive impact of the applied CL rules on the MT output. However, at rule level, only four rules proved to have positive effects on the aforementioned parameters; three rules had negative effects on the parameters; and two rules did not show any significant impact. At rule and system level, the rules affected the MT systems differently, as expected. Rules that had a positive impact on earlier MT approaches did not show the same impact on the neural MT approach. Furthermore, neural MT delivered distinctly better results than earlier MT approaches, namely the highest error-free, style and content quality rates both before and after applying the rules, which indicates that neural MT offers a promising solution that no longer requires CL rules for improving the MT output.

Funder

Johannes Gutenberg-Universität Mainz

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence, Linguistics and Language, Language and Linguistics, Software


