Affiliation:
1. University of Southern California/Information Sciences Institute, Los Angeles, California
2. Instituto Superior Técnico/Technical University of Lisbon/INESC-ID
Abstract
Configurable architectures offer the unique opportunity of realizing hardware designs tailored to the specific data and computational patterns of an application code. Customizing the storage structures is becoming increasingly important in mitigating the continuing gap between memory latencies and internal computing speeds. In this article we describe and evaluate a compiler algorithm that maps the arrays of a loop-based computation to internal storage structures, either RAM blocks or discrete registers. Our objective is to minimize the overall execution time while considering the capacity and bandwidth constraints of the storage resources. The novelty of our approach lies in creating a single framework that combines high-level compiler techniques with lower-level scheduling information for mapping the data. We illustrate the benefits of our approach for a set of image/signal processing kernels using a Xilinx Virtex™ Field-Programmable Gate Array (FPGA). Our algorithm leads to faster designs compared to the state-of-the-art custom data layout mapping technique, in some instances using less storage. When compared to hand-coded designs, our results are comparable in terms of execution time and resources, but are derived in a minute fraction of the design time.
Funder
National Science Foundation
Publisher
Association for Computing Machinery (ACM)
Subject
Electrical and Electronic Engineering, Computer Graphics and Computer-Aided Design, Computer Science Applications
Cited by
15 articles.
1. COSMOS;ACM Transactions on Embedded Computing Systems;2017-10-10
2. High-Level Synthesis for Semi-Global Matching: Is the Juice Worth the Squeeze?;IEEE Access;2017
3. System-Level Optimization of Accelerator Local Memory for Heterogeneous Systems-on-Chip;IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems;2016
4. Memory Partitioning in the Limit;International Journal of Parallel Programming;2015-10-26
5. Memory Interface Design for 3D Stencil Kernels on a Massively Parallel Memory System;ACM Transactions on Reconfigurable Technology and Systems;2015-10