Affiliation:
1. MIT Laboratory for Computer Science
Abstract
This article presents asymptotically optimal algorithms for rectangular matrix transpose, fast Fourier transform (FFT), and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms, these algorithms are cache oblivious: no variables dependent on hardware parameters, such as cache size and cache-line length, need to be tuned to achieve optimality. Nevertheless, these algorithms use an optimal amount of work and move data optimally among multiple levels of cache. For a cache with size $M$ and cache-line length $B$ where $M = \Omega(B^2)$, the number of cache misses for an $m \times n$ matrix transpose is $\Theta(1 + mn/B)$. The number of cache misses for either an $n$-point FFT or the sorting of $n$ numbers is $\Theta(1 + (n/B)(1 + \log_M n))$. We also give a $\Theta(mnp)$-work algorithm to multiply an $m \times n$ matrix by an $n \times p$ matrix that incurs $\Theta(1 + (mn + np + mp)/B + mnp/(B\sqrt{M}))$ cache faults.
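To make the idea of a cache-oblivious transpose concrete, here is a minimal sketch of a divide-and-conquer rectangular transpose. It is an illustration written for this summary, not the authors' code: the function name `transpose`, the row-major list layout, and the fixed base-case size of 16 elements are all assumptions. The point it shows is that the recursion halves the larger dimension until the subproblem is small, and neither $M$ nor $B$ appears anywhere in the code.

```python
def transpose(a, m, n):
    """Return the n x m transpose of the m x n row-major list `a`.

    Hypothetical helper for illustration; the layout and the base-case
    size are assumptions, not details taken from the article.
    """
    out = [0] * (m * n)

    def rec(r0, r1, c0, c1):
        rows, cols = r1 - r0, c1 - c0
        if rows * cols <= 16:
            # Small fixed base case: copy the block element by element.
            # The constant only limits recursion overhead; it is not a
            # cache parameter.
            for i in range(r0, r1):
                for j in range(c0, c1):
                    out[j * m + i] = a[i * n + j]
        elif rows >= cols:
            # Split the taller dimension in half.
            mid = r0 + rows // 2
            rec(r0, mid, c0, c1)
            rec(mid, r1, c0, c1)
        else:
            # Split the wider dimension in half.
            mid = c0 + cols // 2
            rec(r0, r1, c0, mid)
            rec(r0, r1, mid, c1)

    rec(0, m, 0, n)
    return out
```

For example, `transpose([0, 1, 2, 3, 4, 5], 2, 3)` returns `[0, 3, 1, 4, 2, 5]`. Because the subproblems shrink geometrically, at some recursion depth they fit in whatever cache is present, which is the intuition behind the $\Theta(1 + mn/B)$ miss bound stated above.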
We introduce an “ideal-cache” model to analyze our algorithms. We prove that an optimal cache-oblivious algorithm designed for two levels of memory is also optimal for multiple levels and that the assumption of optimal replacement in the ideal-cache model can be simulated efficiently by LRU replacement. We offer empirical evidence that cache-oblivious algorithms perform well in practice.
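The cache parameters in these bounds can be explored with a small simulation. The sketch below counts misses for a fully associative cache of $M$ words with $B$-word lines under LRU replacement. It is an assumption-laden illustration: the function name `lru_misses` is hypothetical, and it models LRU rather than the optimal replacement that the ideal-cache model assumes (the abstract states that LRU simulates optimal replacement efficiently, which is why LRU is a reasonable stand-in here).

```python
from collections import OrderedDict


def lru_misses(addresses, M, B):
    """Count misses for a fully associative LRU cache.

    M is the cache size in words and B the line length in words.
    Hypothetical helper for illustration only.
    """
    lines = M // B          # number of cache lines that fit
    cache = OrderedDict()   # keys are block ids, ordered oldest to newest
    misses = 0
    for addr in addresses:
        block = addr // B
        if block in cache:
            cache.move_to_end(block)       # mark as most recently used
        else:
            misses += 1
            cache[block] = True
            if len(cache) > lines:
                cache.popitem(last=False)  # evict least recently used
    return misses


# Walk a 64 x 64 row-major matrix in row order and in column order.
N = 64
row_order = [i * N + j for i in range(N) for j in range(N)]
col_order = [i * N + j for j in range(N) for i in range(N)]
```

With `M = 64` and `B = 8`, the row-order walk incurs one miss per line (512 in total), while the column-order walk misses on every access (4096), since each column touches 64 distinct lines and the cache holds only 8. This gap is the kind of behavior the article's transpose bound avoids.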
Funder
National Science Foundation
Division of Computing and Communication Foundations
Division of Computer and Network Systems
Defense Advanced Research Projects Agency
Publisher
Association for Computing Machinery (ACM)
Subject
Mathematics (miscellaneous)
Cited by
90 articles.