Improving Performance of Hardware Accelerators by Optimizing Data Movement: A Bioinformatics Case Study
Published: 2023-01-24
Volume: 12
Issue: 3
Page: 586
ISSN: 2079-9292
Container-title: Electronics
Short-container-title: Electronics
Language: en
Author:
Knoben, Peter 1; Alachiotis, Nikolaos 1 (ORCID)
Affiliation:
1. Faculty of EEMCS, University of Twente, 7522 NB Enschede, The Netherlands
Abstract
Modern hardware accelerator cards give developers an accessible platform for reducing the execution times of computationally expensive algorithms. The most widely used systems, however, have dedicated memory spaces, so the processor must transfer data to the accelerator-card memory before computation can begin. For data-intensive algorithms, this data movement currently limits the performance gain obtainable from an accelerator card. To this end, this work aims to reduce the effect of data movement and improve overall performance by systematically caching data on the accelerator card. We designed a software-controlled split cache in which data are cached on the accelerator, and we assessed its efficacy using a data-intensive bioinformatics application that infers the evolutionary history of a set of organisms by constructing phylogenetic trees. Our results revealed that software-controlled data caching on a datacenter-grade FPGA accelerator card reduced the overhead of data movement by 90%. This yielded a reduction in total execution time of between 32% and 40% for the entire application when phylogenetic trees of various sizes were constructed.
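The core idea of the abstract can be illustrated with a minimal model: keep data blocks resident in accelerator memory between kernel invocations so that repeated accesses skip the host-to-device copy. This is an illustrative sketch only, not the paper's implementation; the class name, LRU policy, and block granularity are assumptions for demonstration, and device memory is simulated with a Python dictionary.

```python
# Illustrative model of software-controlled caching on an accelerator card:
# a block is copied from host to device memory only on a cache miss, so a
# reused working set pays the transfer cost once instead of on every access.
class AcceleratorCache:
    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.resident = {}   # block_id -> data held in simulated device memory
        self.order = []      # least-recently-used order, oldest first
        self.transfers = 0   # host-to-device copies actually performed

    def get(self, block_id, host_data):
        if block_id in self.resident:            # hit: no data movement
            self.order.remove(block_id)
            self.order.append(block_id)
            return self.resident[block_id]
        if len(self.resident) >= self.capacity:  # full: evict the LRU block
            victim = self.order.pop(0)
            del self.resident[victim]
        self.transfers += 1                      # miss: pay the copy once
        self.resident[block_id] = host_data
        self.order.append(block_id)
        return self.resident[block_id]

cache = AcceleratorCache(capacity_blocks=2)
for block in [0, 1, 0, 1, 0, 1]:   # working set of two blocks, reused 3x
    cache.get(block, host_data=f"block-{block}")
print(cache.transfers)  # 2 copies instead of 6 without caching
```

In a real deployment the dictionary would correspond to buffers allocated in the FPGA card's on-board memory, and the miss path to an explicit host-to-device transfer; the same bookkeeping decides, in software, which data stay resident.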
Subject: Electrical and Electronic Engineering; Computer Networks and Communications; Hardware and Architecture; Signal Processing; Control and Systems Engineering
Cited by: 1 article