Exploiting Representations from Statistical Machine Translation for Cross-Language Information Retrieval

Published:2014-10-28 Issue:4 Volume:32 Page:1-32
ISSN:1046-8188
Container-title:ACM Transactions on Information Systems
language:en
Short-container-title:ACM Trans. Inf. Syst.

Author:

Ture Ferhan¹, Lin Jimmy²

Affiliation:

1. Raytheon BBN Technologies, Cambridge, MA

2. University of Maryland at College Park

Abstract

This work explores how internal representations of modern statistical machine translation systems can be exploited for cross-language information retrieval. We tackle two core issues in query translation: how to exploit context to generate more accurate translations and how to preserve ambiguity that may be present in the original query, thereby retaining a diverse set of translation alternatives. These two considerations are often in tension since ambiguity in natural language is typically resolved by exploiting context, but effective retrieval requires striking the right balance. We propose two novel query translation approaches: the grammar-based approach extracts translation probabilities from translation grammars, while the decoder-based approach takes advantage of n-best translation hypotheses. Both are context-sensitive, in contrast to a baseline context-insensitive approach that uses bilingual dictionaries for word-by-word translation. Experimental results show that by “opening up” modern statistical machine translation systems, we can access intermediate representations that yield high retrieval effectiveness. By combining evidence from multiple sources, we demonstrate significant improvements over competitive baselines on standard cross-language information retrieval test collections. In addition to effectiveness, the efficiency of our techniques is explored as well.

Funder

Defense Advanced Research Projects Agency

Division of Information and Intelligent Systems

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Science Applications; General Business, Management and Accounting; Information Systems

Link

https://dl.acm.org/doi/pdf/10.1145/2644807

Reference: 68 articles.

1. Phase-based information retrieval

2. Phrasal translation and query expansion techniques for cross-language information retrieval

3. Information retrieval as statistical translation

Cited by 7 articles.

1. Semantic morphological variant selection and translation disambiguation for cross-lingual information retrieval;Multimedia Tools and Applications;2021-06-11

2. A Study of Neural Matching Models for Cross-lingual IR;Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval;2020-07-25

3. Information-seeking in multilingual digital libraries;Library Hi Tech;2020-01-03

4. Query-dependent learning to rank for cross-lingual information retrieval;Knowledge and Information Systems;2018-07-04

5. An iterative method for personalized results adaptation in cross-language search;Information Sciences;2018-03