Affiliation:
1. Nanyang Technological University, Singapore
Abstract
This article presents a comprehensive survey of time-multiplexed (TM) FPGA overlays from the research literature. These overlays are categorized by implementation into two groups: processor-based overlays, whose implementation follows that of conventional silicon-based microprocessors; and CGRA-like overlays, with either an array of interconnected processor-based functional units or medium-grained arithmetic functional units. Time-multiplexing allows an overlay to change its behavior on a cycle-by-cycle basis while executing the application kernel, thus allowing better sharing of the limited FPGA hardware resources. However, most TM overlays suffer from large resource overheads, due either to the underlying processor-like architecture (for processor-based overlays) or to the routing array and instruction storage requirements (for CGRA-like overlays). Reducing the area overhead of CGRA-like overlays, specifically that of the routing network, and better utilizing the hard macros in the target FPGA are active areas of research.
Funder
Ministry of Education (MoE), Singapore
Publisher
Association for Computing Machinery (ACM)
Subject
Electrical and Electronic Engineering, Computer Graphics and Computer-Aided Design, Computer Science Applications
Reference
81 articles.
1. COBHAM GAISLER AB. 2017. GRLIB IP core user’s manual. (2017).
2. FGPU
3. Altera. 2016. Nios II processor reference handbook.
4. FlexGrip: A soft GPGPU for FPGAs
5. Enabling GPGPU Low-Level Hardware Explorations with MIAOW
Cited by
15 articles.
1. A Visionary Look at the Security of Reconfigurable Cloud Computing;Proceedings of the IEEE;2023-12
2. Modular VNF Components Acceleration With FPGA Overlays;IEEE Transactions on Network and Service Management;2023-03
3. A Scalable Many-core Overlay Architecture on an HBM2-enabled Multi-Die FPGA;ACM Transactions on Reconfigurable Technology and Systems;2023-01-18
4. TelaMalloc: Efficient On-Chip Memory Allocation for Production Machine Learning Accelerators;Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1;2022-12-19
5. An efficient FPGA overlay for MPI-2 RMA parallel applications;2022 20th IEEE Interregional NEWCAS Conference (NEWCAS);2022-06-19