PIPELINING A SKEW-INSENSITIVE PARALLEL JOIN ALGORITHM

Published: 2003-09 Issue: 03 Volume: 13 Page: 317-328
ISSN: 0129-6264
Container-title: Parallel Processing Letters
Language: en
Short-container-title: Parallel Process. Lett.

Author:

BAMHA M.¹, EXBRAYAT M.¹

Affiliation:

1. LIFO, Université d'Orléans, BP 6759, 45067 Orléans cedex 2, France

Abstract

Most standard parallel join algorithms try to overcome data skew with a relatively static approach. The way they distribute data (and hence computation) over nodes depends on a data redistribution algorithm (hashing or range partitioning) that is fixed before the actual join begins. In contrast, our approach pre-scans the data in order to choose an efficient join method for each value of the join attribute. This approach has already proved efficient, both theoretically and practically, in our previous papers. In this paper we introduce a new pipelined version of our frequency-adaptive join algorithm. Pipelining offers flexible strategies for resource allocation while avoiding unnecessary disk input/output of intermediate join results when computing multi-join queries. We present a detailed version of the algorithm and a cost analysis based on the BSP (Bulk Synchronous Parallel) model, showing that our pipelined algorithm achieves noticeable improvements over the sequential parallel version for multi-join queries while guaranteeing perfect balancing properties.
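The idea of pre-scanning the data and choosing a join method per join-attribute value can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not spell out; it only shows one plausible way to turn histograms into a per-value plan. The function name `plan_join`, the `threshold` parameter, and the strategy labels are all hypothetical.

```python
import zlib
from collections import Counter


def plan_join(r_keys, s_keys, num_nodes, threshold):
    """Choose a redistribution strategy for each join-attribute value.

    Illustrative assumption: values whose frequency stays below
    `threshold` in both relations are hash-partitioned, while frequent
    (skewed) values keep the larger side in place and replicate the
    smaller side instead.
    """
    hist_r = Counter(r_keys)  # pre-scan of relation R
    hist_s = Counter(s_keys)  # pre-scan of relation S
    plan = {}
    # Only values present in both relations can produce join results.
    for value in hist_r.keys() & hist_s.keys():
        freq_r, freq_s = hist_r[value], hist_s[value]
        if max(freq_r, freq_s) < threshold:
            # Infrequent value: send both sides to one node chosen by a stable hash.
            node = zlib.crc32(repr(value).encode()) % num_nodes
            plan[value] = ("hash", node)
        elif freq_r >= freq_s:
            # R dominates: leave R tuples where they are, replicate S tuples.
            plan[value] = ("replicate_s", None)
        else:
            # S dominates: leave S tuples where they are, replicate R tuples.
            plan[value] = ("replicate_r", None)
    return plan


plan = plan_join(r_keys=[1] * 10 + [2, 3], s_keys=[1, 1, 2, 4],
                 num_nodes=4, threshold=5)
```

Here the value 1 is frequent in R, so it is handled by replication rather than by hashing all of its tuples to a single node, which is what a static hash partitioning would do.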

Publisher

World Scientific Pub Co Pte Lt

Subject

Hardware and Architecture, Theoretical Computer Science, Software

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0129626403001306

References: 6 articles.

1. Parallel database systems

2. Effectiveness of parallel joins

3. A bridging model for parallel computation

Cited by 8 articles.

1. A Scalable Similarity Join Algorithm Based on MapReduce and LSH;International Journal of Parallel Programming;2022-05-23

2. CPS implementation of a BSP composition primitive with application to the implementation of algorithmic skeletons;International Journal of Parallel, Emergent and Distributed Systems;2011-08

3. OSL: Optimized Bulk Synchronous Parallel Skeletons on Distributed Arrays;Lecture Notes in Computer Science;2009

4. An Efficient Pipelined Parallel Join Algorithm on Heterogeneous Distributed Architectures;Communications in Computer and Information Science;2009

5. Introduction to the special issue on semantics and costs models for high-level parallel programming;Computer Languages, Systems & Structures;2007-10