Affiliation:
1. Tohoku University, Sendai, Japan
Abstract
This paper presents the detailed design of a custom computing machine for fully-streamed LBM computation on multiple FPGAs and evaluates its efficiency with a prototype implementation. We design a unit for completely streamed computation, including boundary treatment with a newly introduced cell attribute. Experimental results demonstrate that the proposed machine achieves high utilization of PEs, 99% of the peak performance, for one and two FPGAs computing a large lattice. This is due to our fully-streamed design, which allows all arithmetic units to be efficiently utilized with a constant memory bandwidth, and to an architecture that exploits the low-latency accelerator domain network (ADN) of a tightly-coupled FPGA cluster for scalable computation.
Publisher
Association for Computing Machinery (ACM)
Cited by
7 articles.