Affiliation:
1. Microsoft Research, Redmond, WA
Abstract
With the emergence of software delivery platforms, code compression has become an important system component that strongly affects performance. This article presents PPMexe, a compression mechanism for program binaries that analyzes their syntax and semantics to achieve superior compression ratios. We use the generic paradigm of prediction by partial matching (PPM) as the foundation of our compression codec. PPMexe combines PPM with two preprocessing steps: (
i
) instruction rescheduling to improve prediction rates and (
ii
) heuristic partitioning of a program binary into streams with high autocorrelation. We improve the traditional PPM algorithm by (
iii
) using an additional alphabet of frequent variable-length supersymbols extracted from the input stream of fixed-length symbols. In addition, PPMexe features (
iv
) a low-overhead mechanism that enables decompression starting from an arbitrary instruction of the executable, a property pivotal for runtime software delivery. We implemented PPMexe for x86 binaries and tested it on several large applications. Binaries compressed using PPMexe were 18--24% smaller than files created using off-the-shelf PPMD, one of the best available compressors
Publisher
Association for Computing Machinery (ACM)
Cited by
10 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献