Affiliation:
1. Imperial College London, UK
2. University of California at Santa Cruz, USA
Abstract
Heterogeneous CPU/FPGA devices, in which a CPU and an FPGA can execute together while sharing memory, are becoming popular in several computing sectors. In this paper, we study the shared-memory semantics of these devices, with a view to providing a firm foundation for reasoning about the programs that run on them. Our focus is on Intel platforms that combine an Intel FPGA with a multicore Xeon CPU. We describe the weak-memory behaviours that are allowed (and observable) on these devices when CPU threads and an FPGA thread access common memory locations in a fine-grained manner through multiple channels. Some of these behaviours are familiar from well-studied CPU and GPU concurrency; others are weaker still. We encode these behaviours in two formal memory models: one operational, one axiomatic. We develop executable implementations of both models, using the CBMC bounded model-checking tool for our operational model and the Alloy modelling language for our axiomatic model. Using these, we cross-check our models against each other via a translator that converts Alloy-generated executions into queries for the CBMC model. We also validate our models against actual hardware by translating 583 Alloy-generated executions into litmus tests that we run on CPU/FPGA devices; when doing this, we avoid the prohibitive cost of synthesising a hardware design per litmus test by creating our own 'litmus-test processor' in hardware. We expect that our models will be useful for low-level programmers, compiler writers, and designers of analysis tools. Indeed, as a demonstration of the utility of our work, we use our operational model to reason about a producer/consumer buffer implemented across the CPU and the FPGA. When the buffer uses insufficient synchronisation -- a situation that our model is able to detect -- we observe that its performance improves at the cost of occasional data corruption.
Funder
Engineering and Physical Sciences Research Council
Publisher
Association for Computing Machinery (ACM)
Subject
Safety, Risk, Reliability and Quality,Software
Cited by
9 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
1. Challenges in Empirically Testing Memory Persistency Models;Proceedings of the 2024 ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results;2024-04-14
2. Systematic Literature Review on Machine Learning and its Impact on APIs Deployment;Computación y Sistemas;2023-12-27
3. Simulating Operational Memory Models Using Off-the-Shelf Program Analysis Tools;IEEE Transactions on Software Engineering;2023-12
4. Building GPU TEEs using CPU Secure Enclaves with GEVisor;Proceedings of the 2023 ACM Symposium on Cloud Computing;2023-10-30
5. Reasoning about MLIR Semantics through Effects and Handlers;Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis;2023-07-12