Affiliation:
1. Purdue University, West Lafayette, IN, USA
Abstract
Among techniques for parallelizing sequential codes, privatization is a common and significant transformation performed by both compilers and runtime parallelizing systems. Without privatization, repetitive updates to the same data structures often introduce spurious data dependences that hide the inherent parallelism. Unfortunately, it remains a significant challenge for compilers to automatically privatize dynamic and recursive data structures, which appear frequently in real applications written in languages such as C/C++. This is because such languages lack a naming mechanism to define the address range of a pointer-based data structure, in contrast to arrays with explicitly declared bounds. In this paper we present a novel solution to this difficult problem: general data structures are expanded so that memory accesses issued from different threads to contentious data structures are directed to different data fields. Based on compile-time type checking and a data dependence graph, this aggressive extension of traditional scalar and array expansion isolates the address ranges used by different threads, avoiding the difficulties of privatization based on thread-private stacks, so that the targeted loop can be effectively parallelized. With this method fully implemented in GCC, experiments are conducted on a set of programs from well-known benchmark suites such as MiBench, MediaBench II and SPECint. Results show that the new approach can lead to a high speedup when executing the transformed code on multiple cores.
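To make the idea of expanding a data structure concrete, the following is a minimal sketch in C, not the paper's GCC implementation. It assumes a fixed thread count and a linked list whose scratch field is overwritten on every traversal; the names `NUM_THREADS`, `node`, `node_expanded`, `mark` and `count_below` are hypothetical and chosen only for illustration. The contended field is replaced by one copy per thread, so writes from different threads land at different addresses.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_THREADS 4  /* assumption: thread count fixed at compile time */

/* Original node: every traversal overwrites `mark`, so concurrent
   traversals would conflict on this field even though each one only
   uses it as scratch space. */
struct node {
    int value;
    int mark;                  /* scratch field reused by each traversal */
    struct node *next;
};

/* Expanded node: the contended field gets one copy per thread. */
struct node_expanded {
    int value;                 /* only read in the loop, left shared */
    int mark[NUM_THREADS];     /* slot `tid` is used only by thread `tid` */
    struct node_expanded *next;
};

/* Counts the nodes whose value is below `limit`, using the caller's
   own `mark` slot as scratch. Traversals run with different `tid`
   values never write to the same address. */
int count_below(struct node_expanded *head, int limit, int tid)
{
    assert(tid >= 0 && tid < NUM_THREADS);
    int count = 0;
    for (struct node_expanded *p = head; p != NULL; p = p->next)
        p->mark[tid] = (p->value < limit);   /* write to this thread's copy */
    for (struct node_expanded *p = head; p != NULL; p = p->next)
        count += p->mark[tid];               /* read back this thread's copy */
    return count;
}
```

In this sketch a parallel loop would pass each worker's thread index as `tid`. The pointer structure of the list is unchanged; only the field that caused the conflict is widened, which is the contrast the abstract draws with copying the structure onto thread-private stacks.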
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Cited by
5 articles.
1. Comparative Analysis of Sequential and Parallel Computing for Object Detection Using Deep Learning Model;2023 24th International Arab Conference on Information Technology (ACIT);2023-12-06
2. LD;ACM Transactions on Architecture and Code Optimization;2017-04-14
3. POSTER;Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming;2017-01-26
4. $\mathrm{DS}_{\mathrm{spirit}}$: a data dependence and stride reference patterns profiling infrastructure;The Journal of Supercomputing;2016-01-16
5. Performance Evaluation and Enhancement of Process-Based Parallel Loop Execution;International Journal of Parallel Programming;2015-10-06