Affiliation:
1. Columbia University
2. NEC Research Institute
Abstract
We introduce a method for learning to find documents on the Web that contain answers to a given natural language question. In our approach, questions are transformed into new queries aimed at maximizing the probability of retrieving answers from existing information retrieval systems. The method involves automatically learning phrase features for classifying questions into different types, automatically generating candidate query transformations from a training set of question/answer pairs, and automatically evaluating the candidate transformations on target information retrieval systems such as real-world general purpose search engines. At run-time, questions are transformed into a set of queries, and reranking is performed on the documents retrieved. We present a prototype search engine, Tritus, that applies the method to Web search engines. Blind evaluation on a set of real queries from a Web search engine log shows that the method significantly outperforms the underlying search engines, and outperforms a commercial search engine specializing in question answering. Our methodology cleanly supports combining documents retrieved from different search engines, resulting in additional improvement with a system that combines search results from multiple Web search engines.
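The run-time step described in the abstract (rewrite a question into several queries, then rerank the retrieved documents) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the Tritus implementation: the `TRANSFORMS` table, its phrases, rewrites, and weights, and the scoring rule in `rerank` are all hypothetical stand-ins for what the method would learn from question/answer pairs.

```python
# Hypothetical transformation table: question phrase -> (rewrite, weight).
# The phrases, rewrites, and weights are illustrative, not values from the paper.
TRANSFORMS = {
    "what is a": [("is a", 0.6), ("refers to", 0.4)],
    "how do i": [("how to", 0.7), ("steps to", 0.3)],
}


def transform_question(question):
    """Return (query, weight) pairs for the first matching question phrase."""
    q = question.lower().rstrip("?").strip()
    for phrase, rewrites in TRANSFORMS.items():
        if q.startswith(phrase + " "):
            rest = q[len(phrase):].strip()
            return [(f'"{rewrite}" {rest}', weight) for rewrite, weight in rewrites]
    return [(q, 1.0)]  # no known phrase: fall back to the question itself


def rerank(results_by_query):
    """Sum the weights of the queries that retrieved each document.

    results_by_query maps a (query, weight) pair to the list of document
    identifiers that query returned. This additive score is an assumption.
    """
    scores = {}
    for (_query, weight), docs in results_by_query.items():
        for doc in docs:
            scores[doc] = scores.get(doc, 0.0) + weight
    # Highest score first; ties broken by document identifier.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

In this sketch, a document retrieved by several rewritten queries accumulates weight from each, so it moves ahead of documents that only one query found.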
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications
Cited by
26 articles.
1. A novel word-graph-based query rewriting method for question answering;Data Technologies and Applications;2023-05-18
2. Query expansion based on term selection for Hindi – English cross lingual IR;Journal of King Saud University - Computer and Information Sciences;2020-03
3. Peer-to-Peer Data Management;Principles of Distributed Database Systems;2019-12-03
4. Parallel Database Systems;Principles of Distributed Database Systems;2019-12-03
5. Database Integration—Multidatabase Systems;Principles of Distributed Database Systems;2019-12-03