Affiliation:
1. Indian Institute of Technology Kharagpur, India
Abstract
In this article, a pseudo-relevance feedback (PRF)-based framework is presented for effective query expansion (QE). As candidate expansion terms, the proposed PRF framework considers terms that are morphological variants of the original query terms and are semantically close to them. This strategy of selecting expansion terms is expected to preserve the query intent after expansion. When judging the suitability of an expansion term with respect to a base query, two aspects of the term's relation to the query are considered. The first is the extent to which the candidate term is semantically linked to the original query; the second is the extent to which the candidate term can supplement the base query terms. The semantic relationship between a query and the expansion terms is modelled using bidirectional encoder representations from transformers (BERT). The degree of similarity is used to estimate the relative importance of each expansion term with respect to the query, and this quantified importance is used to assign weights to the expansion terms in the final query. Finally, the expansion terms are grouped into semantic clusters to strengthen the original query intent. Experiments were performed on three different Text REtrieval Conference (TREC) collections to validate the effectiveness of the proposed QE algorithm. The results show that the proposed QE approach yields competitive retrieval effectiveness compared with existing state-of-the-art PRF methods in terms of mean average precision (MAP) and precision at position 10 (P@10).
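The similarity-based weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional vectors are hypothetical stand-ins for BERT embeddings, and the function names, the use of cosine similarity, the `top_k` cut-off, and the normalisation of weights to sum to one are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expansion_weights(query_vec, candidate_vecs, top_k=2):
    """Score candidate terms by similarity to the query and normalise the top_k scores.

    Assumption: negative similarities are clipped to zero, and the weights of
    the retained terms are scaled to sum to one.
    """
    sims = {term: max(cosine(query_vec, vec), 0.0)
            for term, vec in candidate_vecs.items()}
    ranked = sorted(sims.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    total = sum(score for _, score in ranked)
    return {term: score / total for term, score in ranked} if total else {}


# Hypothetical toy embeddings; a real system would obtain these from a BERT encoder.
query_vec = [1.0, 0.0]
candidates = {
    "retrieval": [1.0, 0.0],  # same direction as the query
    "retrieve": [1.0, 1.0],   # partially aligned
    "banana": [0.0, 1.0],     # orthogonal, so it receives zero similarity
}

weights = expansion_weights(query_vec, candidates)
print(weights)
```

In this toy setting the two terms closest to the query are kept and the unrelated term is dropped; the resulting weights could then be attached to the expansion terms when the final query is built.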
Subject
Library and Information Sciences, Information Systems