Affiliation:
1. IIT Madras, India
2. Intel Labs, Santa Clara, CA
Abstract
Bitwidth-aware register allocation has caught the attention of researchers aiming to effectively reduce the number of variables spilled into memory. For general-purpose processors, this improves the execution time performance and reduces runtime memory requirements (which in turn helps in the compilation of programs targeted to systems with constrained memory). Additionally, bitwidth-aware register allocation has been effective in reducing power consumption in embedded processors. One of the key components of bitwidth-aware register allocation is the variable packing algorithm that packs multiple narrow-width variables into one physical register. Tallam and Gupta [2003] have proved that optimal variable packing is an NP-complete problem for arbitrary-width variables and have proposed an approximate solution.
In this article, we analyze the complexity of the variable packing problem and present three enhancements that improve the overall packing of variables. In particular, the improvements we describe are: (a) Width Static Single Assignment (W-SSA) form: a representation that splits the live range of a variable into several fixed-width live ranges (W-SSA variables); (b) PoTR Representation: the use of a powers-of-two representation for the bitwidth information of W-SSA variables. Our empirical results show that the bit wastage resulting from overapproximating the width of each variable to the next power of two is a small fraction of the total number of bits in use (≈13%). The main advantage of this representation is that it leads to optimal variable packing in polynomial time; (c) Combined Packing and Coalescing: we discuss the importance of coalescing (combining variables whose live ranges do not interfere) in the context of variable packing and present an iterative algorithm to perform coalescing and packing of W-SSA variables represented in PoTR. Our experimental results show up to a 76.00% decrease in the number of variables compared to the number of variables in the input program in Static Single Assignment (SSA) form. This reduction in the number of variables led to a significant reduction in dynamic spilling, packing, and unpacking instructions.
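To make the PoTR idea concrete, the following is a minimal sketch, not the algorithm from the article. It rounds each bit width up to the next power of two and then places the rounded widths into registers largest-first. It deliberately ignores live-range interference and coalescing, which the article's approach has to handle. The function names `next_pow2` and `pack_potr`, the 32-bit register size, and the sample widths are all assumptions made for illustration.

```python
def next_pow2(width: int) -> int:
    """Round a positive bit width up to the next power of two."""
    if width < 1:
        raise ValueError("width must be positive")
    return 1 << (width - 1).bit_length()


def pack_potr(widths, reg_bits=32):
    """Place power-of-two-rounded widths into registers, largest first.

    Returns (number of registers used, bits lost to rounding).
    Assumes reg_bits is itself a power of two and that every
    variable may share a register with every other one
    (no interference constraints are modeled).
    """
    rounded = sorted((next_pow2(w) for w in widths), reverse=True)
    free = []  # free bits remaining in each register opened so far
    for w in rounded:
        if w > reg_bits:
            raise ValueError("variable wider than a register")
        for i, remaining in enumerate(free):
            if remaining >= w:
                free[i] -= w  # fits into an already opened register
                break
        else:
            free.append(reg_bits - w)  # open a new register
    wasted = sum(rounded) - sum(widths)
    return len(free), wasted


if __name__ == "__main__":
    sample_widths = [3, 5, 8, 12, 1, 16, 7]  # hypothetical widths in bits
    registers, wasted = pack_potr(sample_widths)
    print(registers, wasted)
```

Because every rounded width divides every larger one, a largest-first placement never leaves a gap that a later, smaller item could not fill. This gives some intuition for why restricting widths to powers of two makes the packing step tractable, at the cost of the rounding overhead the abstract reports.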
Funder
New Faculty Seed Grant
Indian Institute of Technology Madras
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
References: 23 articles.
1. Alstrup, S., Lauridsen, P. W., and Thorup, M. 1996. Dominators in linear time. DIKU Tech. rep. 35. University of Copenhagen.
2. Optimal spilling for CISC machines with few registers
3. Enhanced Bitwidth-Aware Register Allocation
4. Bitwise Benchmarks. 2013. http://www.cag.lcs.mit.edu/bitwise/bitwise_benchmarks.htm
Cited by
5 articles.
1. Time series data encoding for efficient storage;Proceedings of the VLDB Endowment;2022-06
2. A polynomial time exact solution to the bit-aware register binding problem;Proceedings of the 31st ACM SIGPLAN International Conference on Compiler Construction;2022-03-18
3. A Vector-Length Agnostic Compiler for the Connex-S Accelerator with Scratchpad Memory;ACM Transactions on Embedded Computing Systems;2020-11-30
4. Survey on Combinatorial Register Allocation and Instruction Scheduling;ACM Computing Surveys;2020-05-31
5. Bytewise Register Allocation;Proceedings of the 18th International Workshop on Software and Compilers for Embedded Systems;2015-06