Affiliation:
1. University of Massachusetts, Amherst, MA
Abstract
We explore the relation between classical probabilistic models of information retrieval and the emerging language modeling approaches. It has long been recognized that the primary obstacle to effective performance of classical models is the need to estimate a relevance model: probabilities of words in the relevant class. We propose a novel technique for estimating these probabilities using the query alone. We demonstrate that our technique can produce highly accurate relevance models, addressing important notions of synonymy and polysemy. Our experiments show relevance models outperforming baseline language modeling systems on TREC retrieval and TDT tracking tasks. The main contribution of this work is an effective formal method for estimating a relevance model with no training data.
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Management Information Systems
Cited by
48 articles.
1. The Surprising Effectiveness of Rankers trained on Expanded Queries;Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval;2024-07-10
2. Old IR Methods Meet RAG;Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval;2024-07-10
3. Utilizing passage‐level relevance and kernel pooling for enhancing BERT‐based document reranking;Computational Intelligence;2024-06
4. How to personalize and whether to personalize? Candidate documents decide;Knowledge and Information Systems;2024-05-27
5. Improving Retrieval in Theme-specific Applications using a Corpus Topical Taxonomy;Proceedings of the ACM Web Conference 2024;2024-05-13