Word Closure-Based Metamorphic Testing for Machine Translation

Author:

Xie Xiaoyuan1ORCID,Jin Shuo1ORCID,Chen Songqiang2ORCID,Cheung Shing-Chi2ORCID

Affiliation:

1. School of Computer Science, Wuhan University, China

2. Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, China

Abstract

With the wide application of machine translation, the testing of Machine Translation Systems (MTSs) has attracted much attention. Recent works apply Metamorphic Testing (MT) to address the oracle problem in MTS testing. Existing MT methods for MTS generally follow the workflow of input transformation and output relation comparison, which generates a follow-up input sentence by mutating the source input and compares the source and follow-up output translations to detect translation errors, respectively. These methods use various input transformations to generate the test case pairs and have successfully triggered numerous translation errors. However, they have limitations in performing fine-grained and rigorous output relation comparison and thus may report many false alarms and miss many true errors. In this paper, we propose a word closure-based output comparison method to address the limitations of the existing MTS MT methods. We first propose word closure as a new comparison unit, where each closure includes a group of correlated input and output words in the test case pair. Word closures suggest the linkages between the appropriate fragment in the source output translation and its counterpart in the follow-up output for comparison. Next, we compare the semantics on the level of word closure to identify the translation errors. In this way, we perform a fine-grained and rigorous semantic comparison for the outputs and thus realize more effective violation identification. We evaluate our method with the test cases generated by five existing input transformations and the translation outputs from three popular MTSs. Results show that our method significantly outperforms the existing works in violation identification by improving the precision and recall and achieving an average increase of 29.9% in F1 score. It also helps to increase the F1 score of translation error localization by 35.9%.

Publisher

Association for Computing Machinery (ACM)

Reference69 articles.

1. 2023. Bing Microsoft Translator. https://www.bing.com/translator.

2. 2023. Dataset replication package and supplementary material for this paper. https://github.com/imjinshuo/Word-Closure-Based-MT.

3. 2023. Google Translate. https://translate.google.com/.

4. 2023. Youdao Translate. https://translate.google.com/.

5. BiasFinder: Metamorphic Test Generation to Uncover Bias for Sentiment Analysis Systems

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3