Two-Stage Hypotheses Generation for Spoken Language Translation

Published: 2009-03 Issue: 1 Volume: 8 Page: 1-22
ISSN: 1530-0226
Container-title: ACM Transactions on Asian Language Information Processing
Language: en

Author:

Chen Boxing¹, Zhang Min¹, Aw Ai Ti¹

Affiliation:

1. Institute for Infocomm Research, Singapore

Abstract

Spoken Language Translation (SLT) is the research area that focuses on the translation of speech or text between two spoken languages. Phrase-based and syntax-based methods represent the state of the art in statistical machine translation (SMT). The phrase-based method specializes in modeling local reorderings and translations of multiword expressions. The syntax-based method uses syntactic knowledge, which lets it better model long-distance word reorderings, discontinuous phrases, and syntactic structure. In this article, we leverage the strengths of these two methods and propose a strategy based on multiple hypotheses generation in a two-stage framework for spoken language translation. The hypotheses are generated in two stages, namely decoding and regeneration. In the decoding stage, we apply state-of-the-art phrase-based and syntax-based methods to generate basic translation hypotheses. Then, in the regeneration stage, many more hypotheses that cannot be captured by the decoding algorithms are produced from the basic hypotheses. We study three regeneration methods in the second stage: redecoding, n-gram expansion, and confusion network. Finally, an additional reranking pass selects the translation outputs using a linear combination of rescoring models. Experimental results on the Chinese-to-English IWSLT-2006 challenge task of translating transcriptions of spontaneous speech show that the proposed mechanism achieves a significant improvement of about 2.80 BLEU points over the baseline.
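The final reranking pass described in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: the feature names (`lm`, `tm`), the weights, the example sentences, and the helper names `pool` and `rerank` are all hypothetical, chosen only to illustrate pooling hypotheses from the decoding and regeneration stages and ranking them by a linear combination of model scores.

```python
from typing import Dict, List, Tuple

# A hypothesis is (translation text, {feature name: score}).
# Feature names and values below are illustrative assumptions.
Hypothesis = Tuple[str, Dict[str, float]]


def pool(*nbest_lists: List[Hypothesis]) -> List[Hypothesis]:
    """Merge several N-best lists, keeping the first copy of each distinct string."""
    seen: Dict[str, Dict[str, float]] = {}
    for nbest in nbest_lists:
        for text, features in nbest:
            seen.setdefault(text, features)
    return list(seen.items())


def rerank(hypotheses: List[Hypothesis],
           weights: Dict[str, float]) -> List[Tuple[float, str]]:
    """Score each hypothesis as a weighted sum of its features, best first."""
    scored = []
    for text, features in hypotheses:
        # Features without a weight contribute nothing to the total.
        total = sum(weights.get(name, 0.0) * value
                    for name, value in features.items())
        scored.append((total, text))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical stage-one (decoding) and stage-two (regeneration) outputs.
decoded = [
    ("i want a ticket", {"lm": -4.0, "tm": -3.0}),
    ("i want ticket", {"lm": -6.0, "tm": -2.5}),
]
regenerated = [
    ("i would like a ticket", {"lm": -3.5, "tm": -3.2}),
    ("i want a ticket", {"lm": -4.0, "tm": -3.0}),  # duplicate of a decoded hypothesis
]

weights = {"lm": 1.0, "tm": 0.5}  # assumed weights; in practice tuned on held-out data
candidates = pool(decoded, regenerated)
ranked = rerank(candidates, weights)
best_score, best_text = ranked[0]
print(best_text)
```

In this toy setup the regenerated hypothesis wins because its combined score is highest, which mirrors the motivation given in the abstract: the second stage can supply candidates that the first-stage decoders did not produce.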

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/1482343.1482347
