Efficient and effective branch reordering using profile data

Author:

Yang Minghui (1), Uh Gang-Ryung (2), Whalley David B. (3)

Affiliation:

1. Oracle Corporation, Redwood Shores, CA

2. Boise State University, Boise, ID

3. Florida State University, Tallahassee, Florida

Abstract

The conditional branch has long been considered an expensive operation. Its relative cost has increased as recently designed machines rely on deeper pipelines and higher degrees of multiple issue. Reducing the number of conditional branches executed often results in a substantial performance benefit. This paper describes a code-improving transformation that reorders sequences of conditional branches that compare a common variable to constants. The goal is to obtain an ordering in which the lowest average number of branches in the sequence is executed. First, sequences of branches that can be reordered are detected in the control flow. Second, profiling information is collected to predict the probability that each branch will transfer control out of the sequence. Third, the cost of performing each conditional branch is estimated. Fourth, the most beneficial ordering of the branches is selected based on the estimated probabilities and costs. The most beneficial ordering often includes the insertion of additional conditional branches that did not previously exist in the sequence. Finally, the control flow is restructured to reflect the new ordering. On average, applying the transformation results in about 8% fewer instructions executed, about 13% fewer branches performed, and about a 4% decrease in execution time.
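The ordering step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the branches are mutually exclusive tests on one variable, that each has a profiled exit probability and a positive estimated cost, and it ignores the insertion of additional branches mentioned above. The names `expected_cost` and `reorder` and the sample numbers are hypothetical. Under these assumptions, sorting by probability divided by cost, in descending order, minimizes the expected cost of the sequence.

```python
from itertools import permutations


def expected_cost(order):
    """Expected cost of executing a branch sequence in the given order.

    Each item is (name, prob, cost). prob is the profiled probability that
    this branch transfers control out of the sequence; cost is the estimated
    cost of executing the branch. The exit conditions are assumed to be
    mutually exclusive, so the probabilities sum to at most 1.
    """
    total = 0.0
    cost_so_far = 0.0
    prob_exited = 0.0
    for _name, prob, cost in order:
        cost_so_far += cost
        # Exiting here means this branch and all earlier ones were executed.
        total += prob * cost_so_far
        prob_exited += prob
    # Fall-through case: every branch was executed and none exited.
    total += (1.0 - prob_exited) * cost_so_far
    return total


def reorder(branches):
    """Order branches by descending prob/cost ratio (costs assumed positive)."""
    return sorted(branches, key=lambda b: b[1] / b[2], reverse=True)


# Hypothetical profile data: (name, exit probability, estimated cost).
branches = [("a", 0.10, 2.0), ("b", 0.60, 2.0), ("c", 0.25, 4.0)]
best = reorder(branches)

# Brute-force check over all orderings, feasible only for short sequences.
brute_force_min = min(expected_cost(p) for p in permutations(branches))
```

Here `expected_cost(branches)` and `expected_cost(best)` can be compared to estimate the benefit of reordering, and `brute_force_min` serves as a sanity check that the greedy ordering is optimal for this small example.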

Publisher

Association for Computing Machinery (ACM)

Subject

Software


Cited by 6 articles.

1. Bungee jumps;Proceedings of the 48th International Symposium on Microarchitecture;2015-12-05

2. Compiler techniques to improve dynamic branch prediction for indirect jump and call instructions;ACM Transactions on Architecture and Code Optimization;2012-01

3. On Partitioning the Domain for Test Case Reusability (Short Paper);2008 The Eighth International Conference on Quality Software;2008-08

4. Ablego: a function outlining and partial inlining framework;Software: Practice and Experience;2007

5. Operation Reuse on Handheld Devices;Languages and Compilers for Parallel Computing;2004
