Authors:
Khoo Christopher S.G., Wan Kwok‐Wai
Abstract
A relevancy‐ranking algorithm for a natural language interface to Boolean online public access catalogs (OPACs) was formulated and compared with the one currently used in the E‐Referencer, a knowledge‐based search interface being developed by the authors. The algorithm makes use of seven well‐known ranking criteria: breadth of match, section weighting, proximity of query words, variant word forms (stemming), document frequency, term frequency and document length. It converts a natural language query into a series of progressively broader Boolean search statements. In a small experiment with ten subjects, in which the algorithm was simulated by hand, it obtained good results, with a mean overall precision of 0.42 and a mean average precision of 0.62, representing a 27 percent improvement in precision and a 41 percent improvement in average precision over the E‐Referencer. The usefulness of each step in the algorithm was analyzed, and suggestions are made for improving the algorithm.
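The idea of turning a natural language query into a series of progressively broader Boolean statements can be illustrated with a minimal sketch. This is not the authors' algorithm, whose steps are not given in the abstract. The stopword list, the function names `keywords` and `broadening_statements`, and the particular broadening order (exact phrase, then all terms ANDed, then smaller AND-combinations, then any term) are assumptions chosen only for illustration. Stemming, section weighting and the other ranking criteria are left out.

```python
from itertools import combinations

# Hypothetical stopword list; a real interface would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "on", "in", "for", "to", "and", "or", "about"}


def keywords(query):
    """Lowercase the query, split on whitespace, and drop stopwords."""
    return [w for w in query.lower().split() if w not in STOPWORDS]


def broadening_statements(query):
    """Return Boolean search statements ordered from narrowest to broadest.

    Assumed broadening order (illustrative only):
      1. all keywords as an exact phrase
      2. all keywords ANDed together
      3. every smaller AND-combination, largest combinations first
      4. any single keyword (OR of all keywords)
    """
    terms = keywords(query)
    if not terms:
        return []

    statements = []
    if len(terms) > 1:
        # Narrowest form: the keywords must appear as a phrase.
        statements.append('"' + " ".join(terms) + '"')

    # All keywords must be present, in any position.
    statements.append(" AND ".join(terms))

    # Relax breadth of match: accept records matching fewer keywords.
    for size in range(len(terms) - 1, 1, -1):
        groups = ["(" + " AND ".join(c) + ")" for c in combinations(terms, size)]
        statements.append(" OR ".join(groups))

    if len(terms) > 1:
        # Broadest form: any one keyword is enough.
        statements.append(" OR ".join(terms))
    return statements


if __name__ == "__main__":
    for statement in broadening_statements("history of the Roman empire"):
        print(statement)
```

Running each statement in order and listing records in the order they are first retrieved gives a coarse ranking in which records matching more of the query appear earlier, which is the intuition behind the breadth-of-match criterion.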
Subject
Library and Information Sciences, Computer Science Applications
References: 31 articles.
Cited by
5 articles.