Affiliation:
1. Univ. of Massachusetts, Amherst, MA
2. Village Networks, Hazlet, NJ
Abstract
The information explosion across the Internet and elswhere offers access to an increasing number of document collections. In order for users to effectively access these collections, information retrieval (IR) systems must provide coordinated, concurrent, and distributed access. In this article, we explore how to achieve scalable performance in a distributed system for collection sizes ranging from 1GB to 128GB. We implement a fully functional distributed IR system based on a multithreaded version of the Inquery simulation model. We measure performance as a function of system parameters such as client command rate, number of document collections, ter ms per query, query term frequency, number of answers returned, and command mixture. Our results show that it is important to model both query and document commands because the heterogeneity of commands significantly impacts performance. Based on our results, we recommend simple changes to the prototype and evaluate the changes using the simulator. Because of the significant resource demands of information retrieval, it is not difficult to generate workloads that overwhelm system resources regardless of the architecture. However under some realistic workloads, we demonstrate system organizations for which response time gracefully degrades as the workload increases and performance scales with the number of processors. This scalable architecture includes a surprisingly small number of brokers through which a large number of clients and servers communicate.
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Science Applications,General Business, Management and Accounting,Information Systems
Reference42 articles.
1. BELL D. AND GRIMSON J. 1992. Distributed Database Systems. Addison-Wesley Reading MA. BELL D. AND GRIMSON J. 1992. Distributed Database Systems. Addison-Wesley Reading MA.
2. Performance Modeling of Distributed Object-Oriented Database Systems
3. Performance evaluation of a distributed architecture for information retrieval
Cited by
31 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献