Affiliation:
1. ATR Spoken Language Communication Research Laboratories, Kyoto-fu, Japan
Abstract
An Example-Based Machine Translation (EBMT) system whose translation example unit is a sentence can produce an accurate and natural translation if translation examples sufficiently similar to the input sentence are retrieved. Such a system, however, suffers from narrow coverage. Mitigating this problem requires a large-scale parallel corpus and, therefore, an efficient method for retrieving translation examples from it. The authors propose an efficient retrieval method for sentence-wise EBMT based on edit distance. The method retrieves the most similar sentences under the edit-distance measure without omissions, that is, without missing any of the closest examples. It employs search-space division, word graphs, and an A* search algorithm. The performance of the EBMT system was evaluated through Japanese-to-English translation experiments using a bilingual corpus comprising hundreds of thousands of sentences from a travel conversation domain. The EBMT system achieved high translation quality by using a large corpus and efficient processing by using the proposed retrieval method.
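To make the retrieval task concrete, the following is a minimal sketch of exhaustive nearest-example retrieval under word-level edit distance. It is not the authors' method: the word graphs and A* search are replaced here by a plain dynamic-programming distance, and "search-space division" is approximated by grouping examples by sentence length, which may differ from the division used in the paper. Unit insertion, deletion, and substitution costs, the function names (`edit_distance`, `build_index`, `retrieve`), and the whitespace tokenization are all assumptions made for illustration.

```python
from collections import defaultdict


def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists (unit costs assumed)."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def build_index(examples):
    """Group (source, target) pairs by source length in words (hypothetical division)."""
    index = defaultdict(list)
    for source, target in examples:
        tokens = source.split()
        index[len(tokens)].append((tokens, target))
    return index


def retrieve(index, query):
    """Return (best distance, all examples at that distance) without omissions."""
    q = query.split()
    best_dist, best = None, []
    # Visit length buckets from the closest length outward. The length
    # difference is a lower bound on edit distance, so once it exceeds the
    # best distance found, no remaining bucket can contain a closer example.
    for length in sorted(index, key=lambda n: abs(n - len(q))):
        if best_dist is not None and abs(length - len(q)) > best_dist:
            break
        for tokens, target in index[length]:
            d = edit_distance(q, tokens)
            if best_dist is None or d < best_dist:
                best_dist, best = d, [(tokens, target)]
            elif d == best_dist:
                best.append((tokens, target))
    return best_dist, best
```

Because a bucket is skipped only when its length difference strictly exceeds the current best distance, every example tied for the minimum distance is returned. The cost of this baseline still grows with the number of examples in the visited buckets, which is the kind of overhead a more refined search over a large corpus would aim to avoid.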
Publisher
Association for Computing Machinery (ACM)
Cited by
3 articles.