Affiliation:
1. Fondazione Ugo Bordoni, Roma, Italy
2. WIND Telecomunicazioni S.p.A, Roma, Italy
Abstract
In this article we consider methods for automatic query expansion from top retrieved documents (i.e., retrieval feedback) that make use of various functions for scoring expansion terms within Rocchio's classical reweighting scheme. An analytical comparison shows that the retrieval performance of methods based on distinct term-scoring functions is comparable on the whole query set but differs considerably on single queries, consistent with the fact that the ordered sets of expansion terms suggested for each query by the different functions are largely uncorrelated. Motivated by these findings, we argue that the results of multiple functions can be merged, by analogy with ensembling classifiers, and present a simple combination technique based on the rank values of the suggested terms. The combined retrieval feedback method is effective not only with respect to unexpanded queries but also to any individual method, with notable improvements on the system's precision. Furthermore, the combined method is robust with respect to variation of experimental parameters and it is beneficial even when the same information needs are expressed with shorter queries.
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Science Applications,General Business, Management and Accounting,Information Systems
Cited by
34 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献