Affiliation:
1. Nanyang Technological University
2. Institute of High Performance Computing, A*STAR, Singapore
Abstract
Query co-processing on graphics processors (GPUs) has become an effective means to improve the performance of main memory databases. However, the relatively low bandwidth and high latency of the PCIe bus often become the bottleneck for co-processing. Recently, coupled CPU-GPU architectures have received a lot of attention, e.g., AMD APUs, which integrate the CPU and the GPU into a single chip. This integration opens up new opportunities for optimizing query co-processing. In this paper, we experimentally revisit hash joins, one of the most important join algorithms for main memory databases, on a coupled CPU-GPU architecture. In particular, we study fine-grained co-processing mechanisms for hash joins with and without partitioning. These co-processing mechanisms open up an interesting design space. We extend existing cost models to automatically guide decisions in this design space. Our experimental results on a recent AMD APU show that (1) the coupled architecture enables fine-grained co-processing and cache reuse, which are inefficient on discrete CPU-GPU architectures; (2) the cost model can automatically guide the design and tuning knobs in the design space; (3) fine-grained co-processing achieves up to 53%, 35% and 28% performance improvement over CPU-only, GPU-only and conventional CPU-GPU co-processing, respectively. We believe that the insights and implications from this study are initial yet important for further research on query co-processing on coupled CPU-GPU architectures.
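To illustrate the kind of fine-grained co-processing the abstract describes, the following minimal C++ sketch splits the probe phase of a hash join between a "CPU share" and a "GPU share" according to a tunable ratio. This is a hypothetical illustration, not the paper's OpenCL implementation: the GPU portion is represented by the same host-side function as a stand-in, and the split ratio `delta` is a placeholder for the value the paper's cost model would choose. On a coupled architecture, the GPU share would instead be dispatched as a kernel over the same shared memory, which is what enables cache reuse without PCIe transfers.

```cpp
// Illustrative sketch only: a hash-join probe whose work is split between a
// CPU path and a (stand-in) GPU path by a tunable ratio, mirroring the idea
// of fine-grained co-processing on a coupled CPU-GPU architecture.
#include <cstdint>
#include <cstdio>
#include <unordered_map>
#include <vector>

struct Tuple { uint32_t key; uint32_t payload; };

// Build phase: hash table over the smaller (build) relation R.
static std::unordered_map<uint32_t, uint32_t> build(const std::vector<Tuple>& r) {
    std::unordered_map<uint32_t, uint32_t> ht;
    ht.reserve(r.size());
    for (const auto& t : r) ht[t.key] = t.payload;
    return ht;
}

// Probe a contiguous slice [lo, hi) of the probe relation S; returns match count.
static size_t probe_slice(const std::unordered_map<uint32_t, uint32_t>& ht,
                          const std::vector<Tuple>& s, size_t lo, size_t hi) {
    size_t matches = 0;
    for (size_t i = lo; i < hi; ++i) matches += ht.count(s[i].key);
    return matches;
}

int main() {
    std::vector<Tuple> R, S;
    for (uint32_t i = 0; i < 1000; ++i)  R.push_back({i, i});
    for (uint32_t i = 0; i < 10000; ++i) S.push_back({i % 2000, i});

    auto ht = build(R);

    // delta is the tuning knob a cost model would pick: the fraction of probe
    // tuples assigned to the CPU; the rest would go to the GPU kernel.
    double delta = 0.4;  // hypothetical value, not taken from the paper
    size_t split = static_cast<size_t>(delta * S.size());

    size_t cpu_matches = probe_slice(ht, S, 0, split);         // CPU share
    size_t gpu_matches = probe_slice(ht, S, split, S.size());  // stand-in for GPU share

    std::printf("matches = %zu\n", cpu_matches + gpu_matches);
    return 0;
}
```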
Cited by
76 articles.
1. Heterogeneous Intra-Pipeline Device-Parallel Aggregations;Proceedings of the 20th International Workshop on Data Management on New Hardware;2024-06-09
2. CPU and GPU Hash Joins on Skewed Data;2024 IEEE 40th International Conference on Data Engineering Workshops (ICDEW);2024-05-13
3. Accelerating Merkle Patricia Trie with GPU;Proceedings of the VLDB Endowment;2024-04
4. Split-bucket partition (SBP): a novel execution model for top-K and selection algorithms on GPUs;The Journal of Supercomputing;2024-03-29
5. Optimising group-by and aggregation on the coupled CPU-GPU architecture;International Journal of Computational Science and Engineering;2024