Affiliation:
1. Massachusetts Institute of Technology, Cambridge, MA
Abstract
We present a new technique, early phase termination, for eliminating idle processors in parallel computations that use barrier synchronization. This technique simply terminates each parallel phase as soon as there are too few remaining tasks to keep all of the processors busy.
Although this technique completely eliminates the idling that would otherwise occur at barrier synchronization points, it may also change the computation and therefore the result that the computation produces. We address this issue by providing probabilistic distortion models that characterize how the use of early phase termination distorts the result that the computation produces. Our experimental results show that, for our set of benchmark applications, 1) early phase termination can improve the performance of the parallel computation, 2) the distortion is small (or can be made small with the use of an appropriate compensation technique), and 3) the distortion models provide accurate and tight distortion bounds. These bounds enable users to evaluate the effect of early phase termination and to confidently accept results from parallel computations that use this technique if they find the distortion bounds acceptable.
Finally, we identify a general computational pattern that works well with early phase termination and explain why computations that exhibit this pattern can tolerate the early termination of parallel tasks without producing unacceptable results.
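As an illustration of the idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It simulates one parallel phase with greedy task assignment and ends the phase the first time a processor would have nothing left to pick up, discarding the tasks still in flight. The names `run_phase` and `compensated_sum` are hypothetical, and the scaling used for compensation is an assumption made for illustration only; the abstract does not specify which compensation technique is used.

```python
import heapq


def run_phase(durations, num_procs, early_termination=True):
    """Simulate one parallel phase; return (end_time, completed_task_indices).

    durations[i] is the (assumed known) running time of task i.
    Tasks are handed out in index order to whichever processor frees up first.
    """
    pending = list(range(len(durations)))[::-1]  # pop() yields tasks in index order
    running = []  # min-heap of (finish_time, task_index) for in-flight tasks
    now = 0.0
    completed = []

    # Start one task per processor.
    while pending and len(running) < num_procs:
        i = pending.pop()
        heapq.heappush(running, (now + durations[i], i))

    while running:
        now, i = heapq.heappop(running)
        completed.append(i)
        if pending:
            # The freed processor picks up the next task.
            j = pending.pop()
            heapq.heappush(running, (now + durations[j], j))
        elif early_termination:
            # A processor would now sit idle at the barrier:
            # end the phase and discard the tasks still in flight.
            break
    return now, completed


def compensated_sum(values, completed):
    """Assumed compensation: scale the partial sum by (total tasks / completed tasks)."""
    partial = sum(values[i] for i in completed)
    return partial * len(values) / len(completed)


if __name__ == "__main__":
    durations = [1, 1, 1, 1, 5]  # one long task causes the load imbalance
    values = [10, 10, 10, 10, 10]  # per-task contributions to a sum reduction
    print(run_phase(durations, 2, early_termination=False))
    end, done = run_phase(durations, 2)
    print(end, done, compensated_sum(values, done))
```

In this toy run the long final task is dropped, so the phase ends at the point where one processor would otherwise start waiting at the barrier. The result is computed from the completed tasks only, which is the kind of change to the computation that the distortion models are meant to bound.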
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Cited by
14 articles.
1. Anwendung III: Design Space Exploration;Automatisierte Analyse von virtuellen Prototypen auf der Ebene elektronischer Systeme;2023
2. PREASC;ACM Transactions on Design Automation of Electronic Systems;2020-10-02
3. Application III: Design Space Exploration;Automated Analysis of Virtual Prototypes at the Electronic System Level;2020
4. Approximate computing for multithreaded programs in shared memory architectures;Proceedings of the 17th ACM-IEEE International Conference on Formal Methods and Models for System Design;2019-10-09
5. Replica;Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems;2019-04-04