Affiliation:
1. University of Florida, Gainesville, FL
Abstract
Sliding-window applications, an important class of digital signal processing applications, are highly amenable to pipeline parallelism on field-programmable gate arrays (FPGAs). Although memory bandwidth restricts parallelism for many applications, sliding-window applications can leverage custom buffers, referred to as sliding-window generators, that provide input bandwidth far exceeding the capabilities of external memory. Previous work has introduced a variety of sliding-window generators, but those approaches typically generate at most one window per cycle, which significantly restricts parallelism. In this article, we address this limitation with a parallel sliding-window generator that can generate a configurable number of windows every cycle. Although in practice the number of parallel windows is limited by memory bandwidth, we show that even with common bandwidth limitations, the presented generator enables near-linear speedups of up to 16x over previous FPGA studies that generate a single window per cycle, which in some cases were already faster than graphics-processing units and microprocessors.
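The idea of emitting several windows per cycle can be illustrated with a small software model. The sketch below is not the article's hardware design; the function names (`parallel_windows`, `count_cycles`) and the parameter `k` (windows per modeled cycle) are hypothetical, and one iteration of the generator stands in for one clock cycle. It shows only that grouping `k` horizontally adjacent windows per step reduces the number of steps roughly by a factor of `k`.

```python
from typing import Iterator, List

Window = List[List[int]]


def parallel_windows(
    image: List[List[int]], win_h: int, win_w: int, k: int
) -> Iterator[List[Window]]:
    """Yield one list per modeled cycle, holding up to k adjacent windows.

    Assumption: windows advance one column at a time within a row band,
    then one row at a time. k = 1 models a single-window generator.
    """
    rows, cols = len(image), len(image[0])
    positions_per_row = cols - win_w + 1  # valid window start columns
    for r in range(rows - win_h + 1):
        for c0 in range(0, positions_per_row, k):
            last = min(c0 + k, positions_per_row)
            # Adjacent windows overlap, so together they span only
            # win_w + (last - c0) - 1 distinct columns of the image.
            yield [
                [row[c:c + win_w] for row in image[r:r + win_h]]
                for c in range(c0, last)
            ]


def count_cycles(image: List[List[int]], win_h: int, win_w: int, k: int) -> int:
    """Count how many modeled cycles are needed to emit every window."""
    return sum(1 for _ in parallel_windows(image, win_h, win_w, k))
```

For a 4x6 image with 3x3 windows there are 8 windows in total, so this model needs 8 cycles with `k = 1` and 2 cycles with `k = 4`. In real hardware the achievable `k` would be bounded by memory bandwidth, as the abstract notes.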
Funder
National Science Foundation
Publisher
Association for Computing Machinery (ACM)
Cited by
5 articles.