Architectural Support for Sharing, Isolating and Virtualizing FPGA Resources

Published:2024-05-21 Issue:2 Volume:21 Page:1-26
ISSN:1544-3566
Container-title:ACM Transactions on Architecture and Code Optimization
language:en
Short-container-title:ACM Trans. Archit. Code Optim.

Author:

Miliadis Panagiotis¹, Theodoropoulos Dimitris¹, Pnevmatikatos Dionisios¹, Koziris Nectarios¹

Affiliation:

1. National Technical University of Athens, Athens, Greece

Abstract

FPGAs are increasingly popular in cloud environments for their ability to offer on-demand acceleration and improved compute efficiency. Providers would like to increase utilization by multiplexing customers on a single device, similar to how processing cores and memory are shared. Nonetheless, multi-tenancy still faces major architectural limitations, including: (a) inefficient sharing of memory interfaces across hardware tasks (HT), exacerbated by technological limitations and peculiarities, (b) insufficient solutions for performance and data isolation and for high quality of service, and (c) absent or simplistic allocation strategies for distributing external FPGA memory effectively across HT. This article presents a full-stack solution for enabling multi-tenancy on FPGAs. Specifically, our work proposes an intra-FPGA virtualization layer to share FPGA interfaces and resources across tenants. To achieve efficient inter-connectivity between virtual FPGAs (vFPGAs) and external interfaces, we employ a compact network-on-chip architecture to optimize resource utilization. Dedicated memory management units implement the concept of virtual memory in FPGAs, providing mechanisms to isolate the address space and enable memory protection. We also introduce a memory segmentation scheme to effectively allocate FPGA address space and enhance isolation through hardware-software support, while preserving the efficacy of memory transactions. We assess our solution on an Alveo U250 Data Center FPGA Card, employing 10 real-world benchmarks from the Rodinia and Rosetta suites. Our framework preserves the performance of HT from a non-virtualized environment, while enhancing the device aggregate throughput through resource sharing: up to 3.96x in isolated settings and up to 2.31x in highly congested settings, where an external interface is shared across four vFPGAs. Finally, our work ensures high quality of service, with HT achieving up to 0.95x of their native performance, even when resource sharing introduces interference from other accelerators.

Funder

European High-Performance Computing Joint Undertaking (JU) project OPTIMA

European Union’s H2020 research and innovation programme project EuroEXA

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3648475

Cited by 1 article.

1. An Image-Retrieval Method Based on Cross-Hardware Platform Features;Applied System Innovation;2024-07-23