TAPA: A Scalable Task-parallel Dataflow Programming Framework for Modern FPGAs with Co-optimization of HLS and Physical Design

Author:

Guo Licheng1ORCID,Chi Yuze1ORCID,Lau Jason1ORCID,Song Linghao1ORCID,Tian Xingyu2ORCID,Khatti Moazin2ORCID,Qiao Weikang1ORCID,Wang Jie1ORCID,Ustun Ecenur3ORCID,Fang Zhenman2ORCID,Zhang Zhiru3ORCID,Cong Jason1ORCID

Affiliation:

1. University of California Los Angeles, USA

2. Simon Fraser University, Canada

3. Cornell University, USA

Abstract

In this article, we propose TAPA, an end-to-end framework that compiles a C++ task-parallel dataflow program into a high-frequency FPGA accelerator. Compared to existing solutions, TAPA has two major advantages. First, TAPA provides a set of convenient APIs that allows users to easily express flexible and complex inter-task communication structures. Second, TAPA adopts a coarse-grained floorplanning step during HLS compilation for accurate pipelining of potential critical paths. In addition, TAPA implements several optimization techniques specifically tailored for modern HBM-based FPGAs. In our experiments with a total of 43 designs, we improve the average frequency from 147 MHz to 297 MHz (a 102% improvement) with no loss of throughput and a negligible change in resource utilization. Notably, in 16 experiments, we make the originally unroutable designs achieve 274 MHz, on average. The framework is available at https://github.com/UCLA-VAST/tapa and the core floorplan module is available at https://github.com/UCLA-VAST/AutoBridge

Funder

Intel/NSF CAPA program, the NSF NeuroNex Award

NIH

NSERC Discovery

CFI John R. Evans Leaders Fund, NSF projects

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Cited by 4 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

1. PASTA: Programming and Automation Support for Scalable Task-Parallel HLS Programs on Modern Multi-Die FPGAs;ACM Transactions on Reconfigurable Technology and Systems;2024-08-05

2. HiHiSpMV: Sparse Matrix Vector Multiplication with Hierarchical Row Reductions on FPGAs with High Bandwidth Memory;2024 IEEE 32nd Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM);2024-05-05

3. HiSpMV: Hybrid Row Distribution and Vector Buffering for Imbalanced SpMV Acceleration on FPGAs;Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays;2024-04

4. Scheduling and Physical Design;Proceedings of the 2024 International Symposium on Physical Design;2024-03-12

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3