Affiliation:
1. University of California, Irvine
Abstract
We have built a runtime compilation system that takes unmodified sequential binaries and improves their performance on off-the-shelf multiprocessors using dynamic vectorization and loop-level parallelization techniques. Our system, Azure, is purely software-based and requires no specific hardware support for speculative thread execution, yet it is able to break even in most cases; that is, the achieved speedup exceeds the cost of runtime monitoring and compilation, often by significant amounts.
Key to this remarkable performance is an offline preprocessing step that extracts a mostly correct control flow graph (CFG) from the binary program ahead of time. This statically obtained CFG is incomplete in that it may be missing some edges corresponding to computed branches. We describe how such additional control flow edges are discovered and handled at runtime, so that an incomplete static analysis never leads to an incorrect optimization result.
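The abstract does not say how newly discovered edges are handled. As a minimal sketch of one conceivable scheme (not Azure's actual mechanism), the hypothetical `EdgeMonitor` below records each computed-branch target observed at runtime and, if a new edge enters the interior of a region, discards that region's optimized code so that execution falls back to the original binary. All names here (`EdgeMonitor`, `observe`, `region_of`, `optimized`) are assumptions made for illustration.

```python
class EdgeMonitor:
    """Hypothetical runtime bookkeeping for edges missing from a static CFG."""

    def __init__(self, static_cfg, region_of):
        # static_cfg: block -> iterable of statically known successor blocks
        self.cfg = {block: set(succs) for block, succs in static_cfg.items()}
        # region_of: block -> head block of the single-entry region containing it
        self.region_of = dict(region_of)
        # heads of regions that currently have optimized code installed
        self.optimized = set()

    def observe(self, src, dst):
        """Record that a computed branch in `src` jumped to `dst`.

        Returns True if the edge was not in the CFG before this call.
        """
        known = self.cfg.setdefault(src, set())
        if dst in known:
            return False
        known.add(dst)
        head = self.region_of.get(dst)
        # An edge into a non-head block gives the region a second entry,
        # so any optimization that assumed a single entry is discarded.
        if head is not None and head != dst:
            self.optimized.discard(head)
        return True
```

Under these assumptions, an edge that was already known leaves the optimized code untouched, while a new edge into the middle of a region conservatively invalidates it.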
The availability of a mostly correct CFG enables us to statically partition a binary executable into single-entry multiple-exit regions and to identify potential parallelization candidates ahead of execution. Program regions that are not candidates for parallelization can thereby be excluded completely from runtime monitoring and dynamic recompilation. Azure's extremely low overhead is a direct consequence of this design.
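To make the notion of a single-entry multiple-exit region concrete, the sketch below partitions a small CFG using the classic extended-basic-block rule: a block starts a new region if it is the program entry or does not have exactly one predecessor, and every other block joins the region of its unique predecessor. This is only one simple way to obtain such regions and is not necessarily the partitioning Azure uses; the function name `partition_seme` and the dictionary encoding of the CFG are assumptions made for illustration.

```python
from collections import defaultdict


def partition_seme(cfg, entry):
    """Partition a CFG into single-entry multiple-exit regions.

    cfg maps each block to a list of its successors. The result maps each
    region head to the set of blocks in that region.
    """
    preds = defaultdict(set)
    for src, succs in cfg.items():
        for dst in succs:
            preds[dst].add(src)

    def is_head(block):
        # Heads are the only blocks that may be entered from outside a region.
        return block == entry or len(preds[block]) != 1

    regions = {}
    for head in set(cfg) | set(preds):
        if not is_head(head):
            continue
        members = {head}
        stack = [head]
        while stack:
            block = stack.pop()
            for succ in cfg.get(block, ()):
                # A non-head successor has `block` as its only predecessor,
                # so it can be absorbed without adding a second entry.
                if not is_head(succ) and succ not in members:
                    members.add(succ)
                    stack.append(succ)
        regions[head] = members
    return regions
```

For a diamond-shaped loop `A -> {B, C} -> D -> A` with entry `A`, this yields one region headed by `A` containing `A`, `B`, and `C`, and a second region containing only the join block `D`.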
Publisher
Association for Computing Machinery (ACM)
Cited by
4 articles.
1. DASS: Dynamic Adaptive Sub-Target Specialization;2023 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW);2023-10-17
2. BinRec;Proceedings of the Fifteenth European Conference on Computer Systems;2020-04-15
3. JIT Technology with C/C++: Feedback-Directed Dynamic Recompilation for Statically Compiled Languages;ACM Transactions on Architecture and Code Optimization;2013
4. JIT technology with C/C++;ACM Transactions on Architecture and Code Optimization;2013-12