Affiliation:
1. Yahoo Research
2. Technion — Israel Institute of Technology
Abstract
Document retrieval methods that utilize relevance feedback often induce a single query model from the set of feedback documents, specifically, the relevant documents. We empirically show that for a few state-of-the-art query-model induction methods, retrieval performance can be significantly improved by constructing the query model from a subset of the relevant documents rather than from all of them. Motivated by this finding, we propose a new approach for relevance-feedback-based retrieval. The approach, derived from the risk minimization framework, is based on utilizing multiple query models induced from all subsets of the given relevant documents. Empirical evaluation shows that the approach posts performance that is statistically significantly better than that of applying the standard practice of utilizing a single query model induced from the relevant documents. While the average relative improvements are small to moderate, the robustness of the approach is substantially higher than that of a variety of reference comparison methods that address various challenges in using relevance feedback.
Funder
Israel Science Foundation
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Science Applications,General Business, Management and Accounting,Information Systems
Cited by
4 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献