Affiliation:
1. Univ. of Passau, Passau, Germany
Abstract
Distributed data processing is becoming a reality. Businesses want to do it for many reasons, and they often must do it in order to stay competitive. While much of the infrastructure for distributed data processing is already there (e.g., modern network technology), a number of issues make distributed data processing still a complex undertaking: (1) distributed systems can become very large, involving thousands of heterogeneous sites including PCs and mainframe server machines; (2) the state of a distributed system changes rapidly because the load of sites varies over time and new sites are added to the system; (3) legacy systems need to be integrated; such legacy systems usually have not been designed for distributed data processing and now need to interact with other (modern) systems in a distributed environment. This paper presents the state of the art of query processing for distributed database and information systems. The paper presents the “textbook” architecture for distributed query processing and a series of techniques that are particularly useful for distributed database systems. These techniques include special join techniques, techniques to exploit intraquery parallelism, techniques to reduce communication costs, and techniques to exploit caching and replication of data. Furthermore, the paper discusses different kinds of distributed systems such as client-server, middleware (multitier), and heterogeneous database systems, and shows how query processing works in these systems.
Publisher
Association for Computing Machinery (ACM)
Subject
General Computer Science, Theoretical Computer Science
Cited by
392 articles.
1. Random Data Distribution for Efficient Parallel Point Cloud Processing;AGILE: GIScience Series;2024-05-30
2. AlterEgo;Proceedings of the 7th International Workshop on Edge Systems, Analytics and Networking;2024-04-22
3. Optimal Query Plans for Geo-distributed Data Analytics at Scale;Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD);2024-01-04
4. Nutzung des Gesundheitssystems mit naturinspirierten Computertechniken: Ein Überblick und zukünftige Perspektiven;Von der Natur inspirierte intelligente Datenverarbeitungstechniken in der Bioinformatik;2024
5. Die NoSQL-Toolbox: Die NoSQL-Landschaft im Überblick;Schnelles und skalierbares Cloud-Datenmanagement;2024