Author:
Lam Monica S., Wilson Robert P.
Abstract
This paper discusses three techniques useful in relaxing the constraints imposed by control flow on parallelism: control dependence analysis, executing multiple flows of control simultaneously, and speculative execution. We evaluate these techniques by using trace simulations to find the limits of parallelism for machines that employ different combinations of these techniques. We have three major results. First, local regions of code have limited parallelism, and control dependence analysis is useful in extracting global parallelism from different parts of a program. Second, a superscalar processor is fundamentally limited because it cannot execute independent regions of code concurrently. Higher performance can be obtained with machines, such as multiprocessors and dataflow machines, that can simultaneously follow multiple flows of control. Finally, without speculative execution to allow instructions to execute before their control dependences are resolved, only modest amounts of parallelism can be obtained for programs with complex control flow.
Publisher
Association for Computing Machinery (ACM)
References
17 articles.
1. Global instruction scheduling for superscalar machines
2. A VLIW architecture for a trace scheduling compiler
3. An efficient method of computing static single assignment form
4. Trace Scheduling: A Technique for Global Microcode Compaction
5. P. Y. Hsu. Highly Concurrent Scalar Processing. PhD thesis, University of Illinois at Urbana-Champaign, 1986.
Cited by
54 articles.
1. Study of Fine-grained Nested Parallelism in CDCL SAT Solvers;ACM Transactions on Parallel Computing;2021-09-30
2. An Instrumentation Approach for Hardware-Agnostic Software Characterization;International Journal of Parallel Programming;2016-03-25
3. Value State Flow Graph;ACM Transactions on Reconfigurable Technology and Systems;2016-02-03
4. Research Infrastructures for Hardware Accelerators;Synthesis Lectures on Computer Architecture;2015-11-18
5. Customizing VLIW processors from dynamically profiled execution traces;Microprocessors and Microsystems;2015-11