On Relevance, Probabilistic Indexing and Information Retrieval

Published: 1960-07 Issue: 3 Volume: 7 Pages: 216-244
ISSN: 0004-5411
Container-title: Journal of the ACM
Language: en
Short-container-title: J. ACM

Author:

Maron M. E.¹, Kuhns J. L.²

Affiliation:

1. The RAND Corporation, Santa Monica, California

2. Ramo-Wooldridge, Canoga Park, California

Abstract

This paper reports on a novel technique for literature indexing and searching in a mechanized library system. The notion of relevance is taken as the key concept in the theory of information retrieval and a comparative concept of relevance is explicated in terms of the theory of probability. The resulting technique, called “Probabilistic Indexing,” allows a computing machine, given a request for information, to make a statistical inference and derive a number (called the “relevance number”) for each document, which is a measure of the probability that the document will satisfy the given request. The result of a search is an ordered list of those documents that satisfy the request, ranked according to their probable relevance. The paper goes on to show that whereas in a conventional library system the cross-referencing (“see” and “see also”) is based solely on the “semantical closeness” between index terms, statistical measures of closeness between index terms can be defined and computed. Thus, given an arbitrary request consisting of one (or many) index term(s), a machine can elaborate on it to increase the probability of selecting relevant documents that would not otherwise have been selected. Finally, the paper suggests an interpretation of the whole library problem as one where the request is considered as a clue on the basis of which the library system makes a concatenated statistical inference in order to provide as an output an ordered list of those documents which most probably satisfy the information needs of the user.
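The ranking and term-closeness ideas in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so the scoring here is an assumption: a Bayes-rule style score (document prior times index-term weight, normalized over documents) stands in for the "relevance number", and a simple co-occurrence ratio stands in for a statistical closeness measure. The names `INDEX`, `PRIOR`, `relevance_numbers`, and `term_closeness`, and all numeric values, are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy index: document -> {index term: weight in [0, 1]}.
# Each weight stands in for the probability that a user who wants the
# document would use that term in a request (illustrative values only).
INDEX: Dict[str, Dict[str, float]] = {
    "doc_a": {"retrieval": 0.9, "probability": 0.5},
    "doc_b": {"retrieval": 0.5, "indexing": 0.75},
    "doc_c": {"probability": 0.75},
}

# Hypothetical prior probability that each document is the one wanted.
PRIOR: Dict[str, float] = {"doc_a": 0.25, "doc_b": 0.5, "doc_c": 0.25}


def relevance_numbers(
    term: str,
    index: Dict[str, Dict[str, float]],
    prior: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Rank documents for a one-term request by an assumed Bayes-rule score."""
    # Unnormalized score: prior(doc) * weight(term | doc).
    scores = {
        doc: prior[doc] * weights.get(term, 0.0)
        for doc, weights in index.items()
    }
    total = sum(scores.values())
    if total == 0:
        return []  # no document is indexed under this term
    # Keep only documents with a non-zero score and normalize to sum to 1.
    ranked = [(doc, s / total) for doc, s in scores.items() if s > 0]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


def term_closeness(
    term: str, other: str, index: Dict[str, Dict[str, float]]
) -> float:
    """Fraction of documents indexed under `term` that also carry `other`."""
    with_term = [weights for weights in index.values() if term in weights]
    if not with_term:
        return 0.0
    both = sum(1 for weights in with_term if other in weights)
    return both / len(with_term)


if __name__ == "__main__":
    for doc, score in relevance_numbers("retrieval", INDEX, PRIOR):
        print(f"{doc}: {score:.3f}")
    print(term_closeness("retrieval", "probability", INDEX))
```

A request for "retrieval" ranks `doc_b` above `doc_a` in this toy data, because the larger prior of `doc_b` outweighs its lower term weight. A closeness value such as `term_closeness("retrieval", "probability", INDEX)` could then be used to decide whether to add a related term to the request, in the spirit of the request elaboration the abstract describes.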

Publisher

Association for Computing Machinery (ACM)

Subject

Artificial Intelligence, Hardware and Architecture, Information Systems, Control and Systems Engineering, Software

Link

https://dl.acm.org/doi/pdf/10.1145/321033.321035

Cited by 488 articles.

1. Discovering Time-Varying Public Interest for COVID-19 Case Prediction in South Korea Using Search Engine Queries: Infodemiology Study (Preprint); 2024-06-21

2. Research on a Capsule Network Text Classification Method with a Self-Attention Mechanism; Symmetry; 2024-04-24

3. Text Representation Based on WT-GloVe Word Vector Weighting Model; 2024 International Conference on Intelligent Computing and Robotics (ICICR); 2024-04-12

4. Using automated text classification to explore uncertainty in NICE appraisals for drugs for rare diseases; International Journal of Technology Assessment in Health Care; 2024

5. Data Augmentation With Semantic Enrichment for Deep Learning Invoice Text Classification; IEEE Access; 2024