Affiliation:
1. University of Massachusetts, Amherst, MA
Abstract
Query reformulation modifies the original query with the aim of better matching the vocabulary of the relevant documents, and consequently improving ranking effectiveness. Previous models typically generate words and phrases related to the original query, but do not consider how these words and phrases would fit together in actual queries. In this article, a novel framework is proposed that models reformulation as a distribution of actual queries, where each query is a variation of the original query. This approach considers an actual query as the basic unit and thus captures important query-level dependencies between words and phrases. An implementation of this framework that only uses publicly available resources is proposed, which makes fair comparisons with other methods using TREC collections possible. Specifically, this implementation consists of a query generation step that analyzes the passages containing query words to generate reformulated queries and a probability estimation step that learns a distribution for reformulated queries by optimizing the retrieval performance. Experiments on TREC collections show that the proposed model can significantly outperform previous reformulation models.
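As an illustration of the central idea in the abstract, the sketch below ranks documents under a distribution over reformulated queries by taking the probability-weighted sum of per-query scores. This is only a minimal sketch and not the article's implementation: the function names, the toy `term_overlap` scorer, the example queries, and the probabilities are all assumptions made for illustration. The article itself learns the distribution by optimizing retrieval performance, which is not shown here.

```python
from typing import Callable, Dict, List, Tuple


def term_overlap(query: str, text: str) -> float:
    """Toy scorer (assumption): fraction of query terms that occur in the text."""
    terms = query.lower().split()
    words = set(text.lower().split())
    return sum(t in words for t in terms) / len(terms) if terms else 0.0


def rank_with_query_distribution(
    query_distribution: Dict[str, float],
    documents: Dict[str, str],
    score: Callable[[str, str], float] = term_overlap,
) -> List[Tuple[str, float]]:
    """Rank documents by the expected score over a distribution of queries.

    query_distribution maps each reformulated query to a weight; the weights
    are normalized so they sum to 1 before the scores are combined.
    """
    total = sum(query_distribution.values())
    if total <= 0:
        raise ValueError("query weights must sum to a positive value")

    ranked = []
    for doc_id, text in documents.items():
        combined = sum(
            (weight / total) * score(query, text)
            for query, weight in query_distribution.items()
        )
        ranked.append((doc_id, combined))

    # Highest expected score first.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical example: two variations of the query "oil spill cleanup".
distribution = {"oil spill cleanup": 0.6, "oil spill removal": 0.4}
docs = {
    "d1": "oil spill cleanup methods",
    "d2": "cooking oil recipes",
}
ranking = rank_with_query_distribution(distribution, docs)
```

Because each reformulated query is scored as a whole, a document is rewarded only when it matches the words of a query together, which is the query-level dependency the abstract refers to.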
Funder
Division of Information and Intelligent Systems
Center for Intelligent Information Retrieval
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Science Applications; General Business, Management and Accounting; Information Systems
Cited by
16 articles.
1. DeepQFM: a deep learning based query facets mining method;Information Retrieval Journal;2023-10-30
2. Improving Search Clarification with Structured Information Extracted from Search Results;Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining;2023-08-04
3. Query Sub-intent Mining by Incorporating Search Results with Query Logs for Information Retrieval;2023 IEEE 8th International Conference on Big Data Analytics (ICBDA);2023-03-03
4. Stochastic Optimization of Text Set Generation for Learning Multiple Query Intent Representations;Proceedings of the 31st ACM International Conference on Information & Knowledge Management;2022-10-17
5. Revisiting Open Domain Query Facet Extraction and Generation;Proceedings of the 2022 ACM SIGIR International Conference on Theory of Information Retrieval;2022-08-23