Affiliation:
1. Vanu Inc., Cambridge, MA
Abstract
The FFTW library for computing the discrete Fourier transform (DFT) has gained a wide acceptance in both academia and industry, because it provides excellent performance on a variety of machines (even competitive with or faster than equivalent libraries supplied by vendors). In FFTW, most of the performance-critical code was generated automatically by a special-purpose compiler, called genfft, that outputs C code. Written in Objective Caml, genfft can produce DFT programs for any input length, and it can specialize the DFT program for the common case where the input data are real instead of complex. Unexpectedly, genfft "discovered" algorithms that were previously unknown, and it was able to reduce the arithmetic complexity of some other existing algorithms. This paper describes the internals of this special-purpose compiler in some detail, and it argues that a specialized compiler is a valuable tool.
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Cited by
11 articles.