Affiliation:
1. Max Planck Institute for Software Systems, Kaiserslautern, Germany
Abstract
A classic problem in parallel computing is determining whether to execute a task in parallel or sequentially. If small tasks are executed in parallel, the task-creation overheads can be overwhelming. If large tasks are executed sequentially, processors may spin idle. This granularity problem, however well known, is not well understood: broadly applicable solutions remain elusive.
We propose techniques for controlling granularity in implicitly parallel programming languages. Using a cost semantics for a general-purpose language in the style of the lambda calculus with support for parallelism, we show that task-creation overheads can indeed slow down parallel execution by a multiplicative factor. We then propose oracle scheduling, a technique for reducing these overheads, which bases granularity decisions on estimates of task-execution times. We prove that, for a class of computations, oracle scheduling can reduce task-creation overheads to a small fraction of the work without adversely affecting available parallelism, thereby leading to efficient parallel executions.
We realize oracle scheduling in practice by a combination of static and dynamic techniques. We require the programmer to provide the asymptotic complexity of every function and use run-time profiling to determine the implicit, architecture-specific constant factors. In our experiments, we were able to reduce the overheads of parallelism to between 3 and 13 percent, while achieving 6- to 10-fold speedups.
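The idea described in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes a linear, programmer-supplied cost (the number of elements to sum), a running-mean estimate of the constant factor, and a cutoff KAPPA. The names Estimator, oracle_sum, and KAPPA, as well as the numeric defaults, are hypothetical. Python threads stand in for parallel tasks, and because of the interpreter lock they will not show real speedups; the point is only the decision rule.

```python
import threading
import time

KAPPA = 1e-3  # hypothetical cutoff in seconds (an assumption, not from the paper)


class Estimator:
    """Running estimate of the constant c in: time ~ c * abstract_cost."""

    def __init__(self, initial_constant=1e-6):
        self.constant = initial_constant  # assumed starting guess
        self.samples = 0
        self._lock = threading.Lock()

    def predict(self, cost):
        # Predicted execution time for a task of the given abstract cost.
        return self.constant * cost

    def report(self, cost, elapsed):
        # Fold one measured sequential run into the running mean.
        if cost <= 0:
            return
        with self._lock:
            self.samples += 1
            self.constant += (elapsed / cost - self.constant) / self.samples


def oracle_sum(xs, lo, hi, est):
    n = hi - lo  # programmer-supplied asymptotic cost: linear in n
    if n < 2 or est.predict(n) <= KAPPA:
        # Predicted to be small: run sequentially and profile the run.
        start = time.perf_counter()
        total = sum(xs[lo:hi])
        est.report(n, time.perf_counter() - start)
        return total
    # Predicted to be large: pay the task-creation overhead and fork.
    mid = lo + n // 2
    box = []
    worker = threading.Thread(
        target=lambda: box.append(oracle_sum(xs, lo, mid, est))
    )
    worker.start()
    right = oracle_sum(xs, mid, hi, est)
    worker.join()
    return box[0] + right
```

In this sketch, only tasks predicted to run longer than KAPPA create a new thread, so the creation overhead is paid only where the estimated work is large enough to amortize it, while the measured sequential runs keep refining the constant factor for the machine at hand.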
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Cited by
5 articles.
1. Analysis and Optimization of Task Granularity on the Java Virtual Machine;ACM Transactions on Programming Languages and Systems;2019-07-20
2. ElasticActor: An Actor System with Automatic Granularity Adjustment;International Journal of Parallel Programming;2018-12-07
3. ML for ML: Learning Cost Semantics by Experiment;Tools and Algorithms for the Construction and Analysis of Systems;2017
4. On the Application Task Granularity and the Interplay with the Scheduling Overhead in Many-Core Shared Memory Systems;2015 IEEE International Conference on Cluster Computing;2015-09
5. Lazy Scheduling;ACM Transactions on Programming Languages and Systems;2014-09-25