Affiliation:
1. Cornell University, Ithaca, New York
Abstract
Natural language query formulations exhibit advantages over artificial language statements since they permit the user to approach the retrieval environment without prior training and without using intermediaries. To obtain adequate retrieval output, it is, however, necessary to emphasize the good terms and to deemphasize the bad ones. The usefulness of the terms in a natural language vocabulary is first characterized in terms of their frequency distribution over the documents of a collection. The construction of "good" natural language vocabularies is then described, and methods are given for improving the vocabulary by transforming terms that operate poorly for retrieval purposes into better ones.
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Management Information Systems
Cited by
13 articles.