Asymmetry-aware work-stealing runtimes

Published:2016-10-12 Issue:3 Volume:44 Page:40-52
ISSN:0163-5964
Container-title:ACM SIGARCH Computer Architecture News
Language:en
Short-container-title:SIGARCH Comput. Archit. News

Author:

Torng Christopher¹, Wang Moyang¹, Batten Christopher¹

Affiliation:

1. Cornell University

Abstract

Amdahl's law provides architects a compelling reason to introduce system asymmetry to optimize for both serial and parallel regions of execution. Asymmetry in a multicore processor can arise statically (e.g., from core microarchitecture) or dynamically (e.g., applying dynamic voltage/frequency scaling). Work stealing is an increasingly popular approach to task distribution that elegantly balances task-based parallelism across multiple worker threads. In this paper, we propose asymmetry-aware work-stealing (AAWS) runtimes, which are carefully designed to exploit both the static and dynamic asymmetry in modern systems. AAWS runtimes use three key hardware/software techniques: work-pacing, work-sprinting, and work-mugging. Work-pacing and work-sprinting are novel techniques that combine a marginal-utility-based approach with integrated voltage regulators to improve performance and energy efficiency in high- and low-parallel regions. Work-mugging is a previously proposed technique that enables a waiting big core to preemptively migrate work from a busy little core. We propose a simple implementation of work-mugging based on lightweight user-level interrupts. We use a vertically integrated research methodology spanning software, architecture, and VLSI to make the case that holistically combining static asymmetry, dynamic asymmetry, and work-stealing runtimes can improve both performance and energy efficiency in future multicore systems.

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3007787.3001142

References: 64 articles.

1. A. Annamalai et al. An Opportunistic Prediction-Based Thread Scheduling to Maximize Throughput/Watt in AMPs. Int'l Conf. on Parallel Architectures and Compilation Techniques, Sep 2013.

2. Energy-performance tradeoffs in processor architecture and circuit design

3. Online Scheduling of Parallel Programs on Heterogeneous Systems with Applications to Cilk

4. Thread criticality predictors for dynamic performance, power, and resource management in chip multiprocessors

5. The PARSEC benchmark suite

Cited by 1 article.

1. Rapid Development of OS Support with PMCSched for Scheduling on Asymmetric Multicore Systems;Euro-Par 2022: Parallel Processing Workshops;2023