Affiliation:
1. IBM Research, Hawthorne, NY, USA
Abstract
Programmers are turning to radical architectures such as reconfigurable hardware (FPGAs) to achieve performance. But such systems, programmed at a very low level in languages with impoverished abstractions, are orders of magnitude more complex to use than conventional CPUs. The continued exponential increase in transistors, combined with the desire to implement ever more sophisticated algorithms, makes it imperative that such systems be programmed at much higher levels of abstraction. One of the fundamental high-level language features is automatic memory management in the form of garbage collection.
We present the first implementation of a complete garbage collector in hardware (as opposed to previous "hardware-assist" techniques), using an FPGA and its on-chip memory. Using a completely concurrent snapshot algorithm, it provides single-cycle access to the heap, and never stalls the mutator for even a single cycle, achieving a deterministic mutator utilization (MMU) of 100%.
We have synthesized the collector to hardware and show that it never consumes more than 1% of the logic resources of a high-end FPGA. For comparison we also implemented explicit (malloc/free) memory management, and show that real-time collection is about 4% to 17% slower than malloc, with comparable energy consumption. Surprisingly, in hardware real-time collection is superior to stop-the-world collection on every performance axis, and even for stressful micro-benchmarks can achieve 100% MMU with heaps as small as 1.01 to 1.4 times the absolute minimum.
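The 100% figure quoted above is a statement about minimum mutator utilization: the worst-case fraction of any time window in which the application (the mutator) makes progress. The following is a minimal sketch of that metric, not the authors' measurement code. It assumes a hypothetical per-cycle trace in which each entry records whether the mutator ran during that clock cycle; the function name `mmu` and the two example traces are illustrative only.

```python
def mmu(mutator_ran, window):
    """Worst-case mutator utilization over every window of `window` cycles.

    mutator_ran: sequence of booleans, one per clock cycle (assumed trace
    format); True means the mutator made progress in that cycle.
    """
    if window <= 0 or window > len(mutator_ran):
        raise ValueError("window must be between 1 and the trace length")
    # Count mutator cycles in the first window, then slide one cycle at a time.
    running = sum(mutator_ran[:window])
    worst = running
    for i in range(window, len(mutator_ran)):
        running += mutator_ran[i] - mutator_ran[i - window]
        worst = min(worst, running)
    return worst / window


# Hypothetical traces: a collector that never stalls the mutator, and a
# stop-the-world collector that pauses it for four consecutive cycles.
stall_free = [True] * 12
stop_the_world = [True] * 4 + [False] * 4 + [True] * 4

print(mmu(stall_free, 4))      # 1.0
print(mmu(stop_the_world, 4))  # 0.0
print(mmu(stop_the_world, 8))  # 0.5
```

Under this definition, a collector that never takes a cycle from the mutator scores 1.0 for every window size, whereas a pausing collector scores 0 for any window that fits inside its longest pause.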
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Cited by
7 articles.
1. Towards Hardware Accelerated Garbage Collection with Near-Memory Processing;2022 IEEE High Performance Extreme Computing Conference (HPEC);2022-09-19
2. Synthesized In-BRAM Garbage Collection for Accelerators with Immutable Memory;2022 32nd International Conference on Field-Programmable Logic and Applications (FPL);2022-08
3. Cephalopode: A custom processor aimed at functional language execution for IoT devices;2020 18th ACM-IEEE International Conference on Formal Methods and Models for System Design (MEMOCODE);2020-12-02
4. Transactional Sapphire;ACM Transactions on Programming Languages and Systems;2018-12-31
5. Direct garbage collection: two-fold speedup for managed language embedded systems;International Journal of Embedded Systems;2018