Affiliation:
1. Univ. of Massachusetts, Amherst
2. Ing. C. Olivetti & Co., Pisa, Italy
Abstract
Signature files provide an efficient access method for text in documents, but retrieval is usually limited to finding documents that contain a specified Boolean pattern of words. Effective retrieval requires that documents with similar meanings be found through a process of plausible inference. The simplest way of implementing this retrieval process is to rank documents in order of their probability of relevance. In this paper, techniques are described for implementing probabilistic ranking strategies with sequential and bit-sliced signature files, and the limitations of these implementations with regard to their effectiveness are pointed out. A detailed comparison is made between signature-based ranking techniques and ranking using term-based document representatives and inverted files. The comparison shows that term-based representations are at least competitive (in terms of efficiency) with signature files and, in some situations, superior.
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Science Applications; General Business, Management and Accounting; Information Systems
Cited by
35 articles.