Exploring Snippets as a Dataset to Overcome Challenges in CLIR

Published:2023 Issue: Volume:54 Page:01012
ISSN:2271-2097
Container-title:ITM Web of Conferences
Short-container-title:ITM Web Conf.

Author:

Asthana Amit, Dwivedi Sanjay K.

Abstract

Cross-lingual information retrieval (CLIR) is a challenging task that requires overcoming linguistic barriers to match user queries with relevant documents in different languages. One of the major challenges in CLIR is the lack of parallel corpora, which hinders the development of effective translation models. This challenge can be addressed by using snippets as a dataset to train CLIR models. Snippets can be automatically extracted from various sources, such as search engine result pages, and can provide a rich and diverse collection for cross-lingual information retrieval. This paper first discusses the challenges in CLIR, then explores the use of snippets as a dataset that can support the development or improvement of techniques for retrieval effectiveness, and finally discusses the advantages and limitations of using a snippet dataset in CLIR.

Publisher

EDP Sciences

Subject

General Medicine

Link

https://www.itm-conferences.org/10.1051/itmconf/20235401012/pdf

Reference: 15 articles.

1. Sharma Vijay Kumar, and Mittal Namita. “Cross lingual information retrieval (CLIR): Review of tools, challenges and translation approaches.” Information systems design and intelligent applications (2016): 699–708.

2. Zhou Dong, et al. “Query expansion for personalized cross-language information retrieval”, Semantic and Social Media Adaptation and Personalization (SMAP), 2015 10th International Workshop on. IEEE, 2015.

3. Seetha A., Das S. and Kumar M., “Evaluation of the English-Hindi Cross Language Information Retrieval System Based on Dictionary Based Query Translation Method,” 10th International Conference on Information Technology (ICIT 2007), 2007, pp. 56–61, doi: 10.1109/ICIT.2007.53.

4. Sun Renxu, Ong Chai-Huat, and Chua Tat-Seng. “Mining dependency relations for query expansion in passage retrieval.” Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval. 2006.

5. Karadzhov Georgi, Nakov Preslav, Marquez Lluis, Barron-Cedeno Alberto, and Koychev Ivan. “Fully automated fact checking using external sources.” arXiv preprint arXiv:1710.00341 (2017).