Cross-Language Latent Relational Search between Japanese and English Languages Using a Web Corpus

Published: 2012-09 Issue: 3 Volume: 11 Pages: 1-33
ISSN: 1530-0226
Container-title: ACM Transactions on Asian Language Information Processing
Language: en

Author:

Duc Nguyen Tuan¹, Bollegala Danushka¹, Ishizuka Mitsuru¹

Affiliation:

1. The University of Tokyo

Abstract

Latent relational search is a novel entity retrieval paradigm based on the proportional analogy between two entity pairs. Given a latent relational search query {(Japan, Tokyo), (France, ?)}, a latent relational search engine is expected to retrieve the entity “Paris” and rank it as the first answer in the result list. A latent relational search engine extracts entities, and the relations between those entities, from a corpus such as the Web. Moreover, from supporting sentences in the corpus (e.g., “Tokyo is the capital of Japan” and “Paris is the capital and biggest city of France”), the search engine must recognize the relational similarity between the two entity pairs. In cross-language latent relational search, the first entity pair and its supporting sentences are in a different language from the second entity pair and its supporting sentences. Therefore, the search engine must recognize similar semantic relations across languages. In this article, we study the problem of cross-language latent relational search between Japanese and English using Web data. To perform cross-language latent relational search at high speed, we propose a multi-lingual indexing method for storing entities and the lexical patterns that represent the semantic relations extracted from Web corpora. We then propose a hybrid lexical pattern clustering algorithm to capture the semantic similarity between lexical patterns across languages. Using this algorithm, we can precisely measure the relational similarity between entity pairs across languages, thereby achieving high precision in the task of cross-language latent relational search. Experiments show that the proposed method achieves an MRR of 0.605 on Japanese-English cross-language latent relational search query sets and reasonable performance on the INEX Entity Ranking task.
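
The query form and the MRR metric mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the pattern strings, counts, cluster labels, and function names below are all hypothetical, and a hand-written pattern-to-cluster table stands in for the hybrid clustering algorithm the article proposes. Relational similarity is approximated here by cosine similarity over cluster counts, which is an assumption made for illustration only.

```python
from collections import Counter
from math import sqrt

# Hypothetical, hand-made clusters that group lexical patterns expressing the
# same relation across Japanese and English (a stand-in for learned clusters).
PATTERN_CLUSTER = {
    "Xの首都はY": "capital_of",
    "Y is the capital of X": "capital_of",
    "Y is the biggest city of X": "largest_city_of",
    "Y is a city in X": "city_in",
}

# Hypothetical pattern counts, as if extracted from supporting sentences.
PAIR_PATTERNS = {
    ("日本", "東京"): Counter({"Xの首都はY": 3}),
    ("France", "Paris"): Counter({"Y is the capital of X": 2,
                                  "Y is the biggest city of X": 1}),
    ("France", "Lyon"): Counter({"Y is a city in X": 2}),
}

def cluster_vector(pair):
    """Map a pair's pattern counts onto language-independent cluster counts."""
    vec = Counter()
    for pattern, count in PAIR_PATTERNS[pair].items():
        vec[PATTERN_CLUSTER[pattern]] += count
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def latent_relational_search(source_pair, target_entity):
    """Answer {source_pair, (target_entity, ?)} by ranking candidate entities."""
    src = cluster_vector(source_pair)
    scored = [
        (candidate, cosine(src, cluster_vector((entity, candidate))))
        for (entity, candidate) in PAIR_PATTERNS
        if entity == target_entity
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [candidate for candidate, _ in scored]

def mean_reciprocal_rank(rankings, gold_answers):
    """Average of 1/rank of the correct answer; a missing answer contributes 0."""
    total = 0.0
    for ranking, gold in zip(rankings, gold_answers):
        if gold in ranking:
            total += 1.0 / (ranking.index(gold) + 1)
    return total / len(rankings)

if __name__ == "__main__":
    print(latent_relational_search(("日本", "東京"), "France"))
```

In this toy data the query {(日本, 東京), (France, ?)} ranks "Paris" ahead of "Lyon" because only the Paris pair shares the `capital_of` cluster with the source pair. An MRR of 1.0 would mean the correct entity is ranked first for every query.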

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/2334801.2334805

Cited by 6 articles.

1. Transitivity of transformation matrices to bridge word vector spaces over 1000 years; The Journal of Supercomputing; 2021-02-19

2. Multilingual Open Information Extraction: Challenges and Opportunities; Information; 2019-07-02

3. Jointly learning word embeddings using a corpus and a knowledge base; PLOS ONE; 2018-03-12

4. Arabic Cross-Language Information Retrieval; ACM Transactions on Asian and Low-Resource Language Information Processing; 2016-03-08

5. A Cross-Lingual Similarity Measure for Detecting Biomedical Term Translations; PLOS ONE; 2015-06-01