FADO: Floorplan-Aware Directive Optimization Based on Synthesis and Analytical Models for High-Level Synthesis Designs on Multi-Die FPGAs

Authors:

Du Linfeng (1), Liang Tingyuan (1), Zhou Xiaofeng (1), Ge Jinming (1), Li Shangkun (2), Sinha Sharad (3), Zhao Jieru (4), Xie Zhiyao (1), Zhang Wei (1)

Affiliations:

1. The Hong Kong University of Science and Technology, Kowloon, Hong Kong

2. Fudan University, Shanghai, China

3. Indian Institute of Technology Goa, Goa, India

4. Shanghai Jiao Tong University, Shanghai, China

Abstract

Multi-die FPGAs are widely adopted for large-scale accelerators, but optimizing high-level synthesis (HLS) designs on them faces two challenges. First, the delay caused by die-crossing nets creates an NP-hard floorplanning problem. Second, traditional directive optimization cannot account for the resource constraints on each die or the timing issues incurred by die crossings. Furthermore, the high algorithmic complexity and large problem scale lead to long runtimes when legalizing the floorplan of HLS designs under different directive configurations. To co-optimize the directives and floorplan of HLS designs on multi-die FPGAs, we formulate the co-search based on bin-packing variants and present two iterative optimization flows. The first (FADO 1.0) relies on a pre-built quality-of-results (QoR) library and combines a greedy, latency-bottleneck-guided directive search with an incremental floorplan legalization. Compared with a global floorplanning solution, its search time is 693X to 4925X shorter and it achieves 1.16X to 8.78X better design performance, measured in workload execution time. To remove the time-consuming QoR library generation, the second flow (FADO 2.0) integrates an analytical QoR model and redesigns the directive search to accelerate convergence. In experiments on mixed dataflow and non-dataflow designs, FADO 2.0 yields a further 1.40X better design performance on average than FADO 1.0 after implementation on the Alveo U250 FPGA.
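
To make the idea of a latency-bottleneck-guided directive search under per-die resource limits more concrete, the following is a minimal Python sketch. It is not the FADO algorithm: the names `Config`, `first_fit`, and `greedy_search` are hypothetical, resources are reduced to a single abstract area number, design latency is taken to be the slowest function (as in a dataflow pipeline), die-crossing delay is not modeled, and a from-scratch first-fit packing stands in for the incremental floorplan legalization described in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """One directive configuration of a function (hypothetical model)."""
    latency: int  # estimated latency in cycles
    area: int     # abstract resource units (stand-in for LUT/FF/DSP/BRAM)


def first_fit(areas, die_capacity, num_dies):
    """Place each function on the first die with enough free area.

    Returns a list of die indices, or None if some function does not fit.
    """
    free = [die_capacity] * num_dies
    placement = []
    for area in areas:
        for die in range(num_dies):
            if free[die] >= area:
                free[die] -= area
                placement.append(die)
                break
        else:
            return None  # no die can host this function
    return placement


def greedy_search(options, die_capacity, num_dies):
    """Greedy bottleneck-guided search (illustrative assumption, not FADO).

    options[f] lists the configurations of function f, ordered from the
    slowest/smallest to the fastest/largest. Each step upgrades the function
    with the highest current latency and keeps the upgrade only if the
    design can still be packed onto the dies.
    """
    n = len(options)
    choice = [0] * n
    placement = first_fit([options[f][0].area for f in range(n)],
                          die_capacity, num_dies)
    if placement is None:
        raise ValueError("baseline configuration does not fit on the dies")

    while True:
        # The bottleneck is the function with the largest current latency.
        bottleneck = max(range(n), key=lambda f: options[f][choice[f]].latency)
        if choice[bottleneck] + 1 >= len(options[bottleneck]):
            break  # the bottleneck has no faster configuration left
        trial = list(choice)
        trial[bottleneck] += 1
        trial_placement = first_fit([options[f][trial[f]].area for f in range(n)],
                                    die_capacity, num_dies)
        if trial_placement is None:
            break  # the upgrade cannot be legalized; keep the last legal design
        choice, placement = trial, trial_placement

    return choice, placement
```

The sketch stops as soon as the bottleneck cannot be improved, because under the max-latency assumption upgrading any other function would consume resources without reducing overall latency.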

Publisher

Association for Computing Machinery (ACM)

