Study on Unknown Term Translation Mining from Google Snippets

Published: 2019-08-28 Issue: 9 Volume: 10 Page: 267
ISSN: 2078-2489
Container-title: Information
Language: en
Short-container-title: Information

Author:

Li Bin, Yao Jianmin

Abstract

Bilingual web pages are widely used to mine translations of unknown terms. This study focuses on three steps of that task: obtaining relevant web pages, extracting translations with correct lexical boundaries, and ranking the translation candidates. Co-occurrence information is used to obtain the subject terms, and the source query is then expanded with the translations of those subject terms to collect effective bilingual search engine snippets. Valid candidates are then extracted from the small, noisy bilingual corpora using an improved frequency change measurement that combines adjacent information. To select an appropriate translation, the method considers surface patterns, frequency–distance, and phonetic features. The experimental results indicate that the proposed method performs well for mining translations of unknown terms.

Funder

National Natural Science Foundation of China

Key Program of the Foundation for Young Talents in the Colleges of Anhui Province

Publisher

MDPI AG

Subject

Information Systems

Link

https://www.mdpi.com/2078-2489/10/9/267/pdf

Cited by 2 articles.

1. Comparative Analysis of Information Retrieval Models on Quran Dataset in Cross-Language Information Retrieval Systems;IEEE Access;2021

2. Direct Answers in Google Search Results;IEEE Access;2020