Affiliation:
1. ENS Paris, Paris, France
2. Jane Street, London, UK
3. Inria, Paris, France
4. University of Cambridge, Cambridge, UK
Abstract
We propose a new language feature for ML-family languages, the ability to selectively unbox certain data constructors, so that their runtime representation gets compiled away to just the identity on their argument. Unboxing must be statically rejected when it could introduce confusion, that is, distinct values with the same representation.
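To make the notion of confusion concrete, here is a minimal OCaml sketch of a disjointness check over a toy model of runtime representations. The model and all names in it (`head`, `ctor`, `no_confusion`) are assumptions made for illustration and are not taken from the paper: every immediate integer is lumped into a single `Imm` class, constant constructors are not modeled, and a heap block is identified by its tag alone.

```ocaml
(* Toy model (hypothetical): the "head" of a runtime value is either some
   immediate integer or a heap block carrying a tag. *)
type head = Imm | Block of int

(* How a constructor is represented in this model. *)
type ctor =
  | Boxed of int          (* ordinary constructor: a block with this tag *)
  | Unboxed of head list  (* unboxed constructor: represented exactly like its
                             argument, so it can have any of the argument's heads *)

let heads = function
  | Boxed tag -> [Block tag]
  | Unboxed arg_heads -> arg_heads

(* True when the two lists of heads have no element in common. *)
let disjoint hs hs' = List.for_all (fun h -> not (List.mem h hs')) hs

(* Accept a declaration only if no head is shared by two constructors;
   a shared head would mean two distinct values with the same representation. *)
let rec no_confusion = function
  | [] -> true
  | c :: rest ->
    List.for_all (fun c' -> disjoint (heads c) (heads c')) rest
    && no_confusion rest
```

In this model, `no_confusion [Unboxed [Imm]; Boxed 0]` accepts a type with one unboxed integer constructor and one boxed constructor, while `no_confusion [Unboxed [Imm]; Unboxed [Imm]]` rejects two unboxed integer constructors, since their values could not be told apart at runtime.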
We discuss the use case of big numbers, where unboxing makes it possible to write code that is both efficient and safe, replacing either a safe but slow version or a fast but unsafe version. We explain the static analysis necessary to reject incorrect unboxing requests. We present our prototype implementation of this feature for the OCaml programming language, and discuss several design choices and the interaction with advanced features such as Guarded Algebraic Datatypes.
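The trade-off in the big-number use case can be sketched in standard OCaml as follows. This is an illustration under stated assumptions, not code from the paper: the type `big` is a hypothetical stand-in for a real arbitrary-precision integer, and the annotation shown in the final comment is an assumed syntax for the prototype that stock OCaml does not accept on a type with several constructors.

```ocaml
(* Hypothetical stand-in for an arbitrary-precision integer; a string of
   decimal digits is enough here because it is a heap block, not an immediate. *)
type big = string

(* Safe but slow: each constructor allocates a box around its argument. *)
type safe = Small of int | Big of big

(* Fast but unsafe: an untyped value that is either an immediate int or a
   [big] block. The type checker can no longer tell which one it is. *)
type fast = Obj.t

let fast_of_int (n : int) : fast = Obj.repr n
let fast_of_big (b : big) : fast = Obj.repr b

(* Recover the safe view by inspecting the runtime representation. *)
let view (x : fast) : safe =
  if Obj.is_int x then Small (Obj.obj x) else Big (Obj.obj x)

(* With constructor unboxing, one would keep a declaration like [safe] and
   annotate it, for instance (assumed syntax, prototype only):
     type t = Small of int [@unboxed] | Big of big [@unboxed]
   Pattern matching would stay type-safe while values are laid out like [fast]. *)
```

The `fast` version avoids the allocation but relies on `Obj`, so any mistake in `view` or in its callers is silently unchecked; this is the kind of code the feature aims to make unnecessary.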
Our static analysis requires expanding type definitions in type expressions, which is not necessarily normalizing in the presence of recursive type definitions. In other words, we must decide normalization of terms in the first-order λ-calculus with recursion. We provide an algorithm to detect non-termination on-the-fly during reduction, with proofs of correctness and completeness. Our algorithm turns out to be closely related to the normalization strategy for macro expansion in the cpp preprocessor.
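The idea of detecting non-termination while expanding can be illustrated with a simplified sketch in the spirit of the hide sets used for cpp macro expansion. It is not the paper's algorithm and carries none of its proofs; the representation (`ty`, `defs`, `head_normalize`) and the example definitions are hypothetical. Each node remembers which definitions were expanded to produce it, and expanding a name that already appears in a node's own hide set is reported as a cycle.

```ocaml
module S = Set.Make (String)

(* Type expressions: a variable, or a name applied to arguments.
   [hide] records the definitions whose expansion produced this node. *)
type ty =
  | Var of string
  | App of { head : string; args : ty list; hide : S.t }

let app ?(hide = S.empty) head args = App { head; args; hide }

(* name -> (parameters, body); names absent from the table are primitive. *)
type defs = (string * (string list * ty)) list

exception Cycle of string

(* Instantiate a definition body: nodes that come from the body receive [hide],
   while substituted arguments keep the hide sets they already had. *)
let rec inst env hide = function
  | Var x -> (try List.assoc x env with Not_found -> Var x)
  | App a -> App { a with args = List.map (inst env hide) a.args; hide }

(* Expand the head while it is a defined name. Raises [Cycle] on a detected
   loop, and [Invalid_argument] (from [List.combine]) on an arity mismatch. *)
let rec head_normalize (defs : defs) t =
  match t with
  | Var _ -> t
  | App { head; args; hide } ->
    match List.assoc_opt head defs with
    | None -> t
    | Some (params, body) ->
      if S.mem head hide then raise (Cycle head);
      let env = List.combine params args in
      head_normalize defs (inst env (S.add head hide) body)

(* Example definitions, written in OCaml-like notation in the comments. *)
let example_defs : defs = [
  "id", (["a"], Var "a");                                 (* 'a id = 'a *)
  "loop", ([], app "loop" []);                            (* loop = loop *)
  "delay", (["a"], app "delay" [app "delay" [Var "a"]]);  (* 'a delay = 'a delay delay *)
  "u", ([], app "id" [app "u" []]);                       (* u = u id *)
]
```

With these definitions, `head_normalize example_defs (app "id" [app "id" [app "int" []]])` reaches the primitive `int`, because the inner `id` keeps its own empty hide set, whereas starting from `loop`, `delay`, or `u` raises `Cycle` instead of diverging.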
Publisher
Association for Computing Machinery (ACM)
Subject
Safety, Risk, Reliability and Quality; Software