A compiler approach to managing storage and memory bandwidth in configurable architectures

Published:2008-09 Issue:4 Volume:13 Page:1-26
ISSN:1084-4309
Container-title:ACM Transactions on Design Automation of Electronic Systems
language:en
Short-container-title:ACM Trans. Des. Autom. Electron. Syst.

Author:

Baradaran Nastaran¹, Diniz Pedro C.²

Affiliation:

1. University of Southern California/Information Sciences Institute, Los Angeles, California

2. Instituto Superior Técnico/Technical University of Lisbon/INESC-ID

Abstract

Configurable architectures offer the unique opportunity of realizing hardware designs tailored to the specific data and computational patterns of an application code. Customizing the storage structures is becoming increasingly important in mitigating the continuing gap between memory latencies and internal computing speeds. In this article we describe and evaluate a compiler algorithm that maps the arrays of a loop-based computation to internal storage structures, either RAM blocks or discrete registers. Our objective is to minimize the overall execution time while considering the capacity and bandwidth constraints of the storage resources. The novelty of our approach lies in creating a single framework that combines high-level compiler techniques with lower-level scheduling information for mapping the data. We illustrate the benefits of our approach for a set of image/signal processing kernels using a Xilinx Virtex™ Field-Programmable Gate Array (FPGA). Our algorithm leads to faster designs compared to the state-of-the-art custom data layout mapping technique, in some instances using less storage. When compared to hand-coded designs, our results are comparable in terms of execution time and resources, but are derived in a minute fraction of the design time.

Funder

National Science Foundation

Publisher

Association for Computing Machinery (ACM)

Subject

Electrical and Electronic Engineering, Computer Graphics and Computer-Aided Design, Computer Science Applications

Link

https://dl.acm.org/doi/pdf/10.1145/1391962.1391969

Reference: 20 articles.

1. A Register Allocation Algorithm in the Presence of Scalar Replacement for Fine-Grain Configurable Architectures

2. Baradaran, N., Diniz, P., and Park, J. 2004. Extending the applicability of scalar replacement to multiple induction variables. In Proceedings of the Workshop on Languages and Compilers for Parallel Computing. Lecture Notes in Computer Science, Springer-Verlag, 455--469. 10.1007/11532378_32

3. Catthoor, F., Danckaert, K., Kulkarni, K., Brockmeyer, E., Kjeldsberg, P., van Achteren, T., and Omnes, T. 2002. Data Access and Storage Management for Embedded Programmable Processors. Kluwer Academic.

Cited by 15 articles.

1. COSMOS;ACM Transactions on Embedded Computing Systems;2017-10-10

2. High-Level Synthesis for Semi-Global Matching: Is the Juice Worth the Squeeze?;IEEE Access;2017

3. System-Level Optimization of Accelerator Local Memory for Heterogeneous Systems-on-Chip;IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems;2016

4. Memory Partitioning in the Limit;International Journal of Parallel Programming;2015-10-26

5. Memory Interface Design for 3D Stencil Kernels on a Massively Parallel Memory System;ACM Transactions on Reconfigurable Technology and Systems;2015-10