Affiliation:
1. Università di Padova, Padova, Italy
2. IBM T.J. Watson Research Center, Yorktown Heights, NY
Abstract
The capability of the Random Access Machine (RAM) to execute any instruction in constant time is not realizable, due to fundamental physical constraints on the minimum size of devices and on the maximum speed of signals. This work explores how well the ideal RAM performance can be approximated, for significant classes of computations, by machines whose building blocks have constant size and are connected at a constant distance.
A novel memory structure is proposed, which is pipelined (can accept a new request at each cycle) and hierarchical, exhibiting optimal latency $a(x) = O(x^{1/d})$ to address $x$, in $d$-dimensional realizations.
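To make the value of pipelining under this latency bound concrete, the following is a minimal sketch rather than the paper's construction. It assumes a hypothetical latency function with constant factor 1 (the abstract gives only the asymptotic bound) and assumes all requests are independent, so that one request can be issued per cycle. The names `latency`, `serialized_time`, and `pipelined_time` are illustrative only.

```python
def latency(x: int, d: int) -> int:
    """Hypothetical latency model: smallest integer r >= 1 with r**d >= x.

    This stands in for a(x) = O(x^(1/d)) with constant factor 1 (an assumption).
    Integer arithmetic avoids floating-point rounding on exact powers.
    """
    r = 1
    while r ** d < x:
        r += 1
    return r


def serialized_time(addresses, d: int) -> int:
    """Total time when each access must finish before the next one starts."""
    return sum(latency(x, d) for x in addresses)


def pipelined_time(addresses, d: int) -> int:
    """Total time when request i is issued at cycle i and accesses fully overlap.

    Assumes every address is known in advance (no dependence between requests).
    """
    return max((i + latency(x, d) for i, x in enumerate(addresses)), default=0)


if __name__ == "__main__":
    addrs = range(1, 17)  # sixteen consecutive addresses, d = 2
    print(serialized_time(addrs, 2), pipelined_time(addrs, 2))
```

Under these assumptions, sixteen consecutive addresses in a two-dimensional layout cost 50 cycles when serialized and 19 cycles when fully overlapped: the pipelined total grows as the number of requests plus a single worst-case latency, instead of the sum of all latencies.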
In spite of block-transfer or other memory-pipeline capabilities, a number of previous machine models do not achieve a full overlap of memory accesses. These are examples of machines with explicit data movement. It is shown that there are direct-flow computations (without branches and indirect accesses) that require time superlinear in the number of instructions, on all such machines.
To circumvent the explicit-data-movement constraints, the Speculative Prefetcher (SP) and the Speculative Prefetcher and Evaluator (SPE) processors are developed. Both processors can execute any direct-flow program in linear time. The SPE also executes in linear time a class of loop programs that includes many significant algorithms. Even quicksort, a somewhat irregular, recursive algorithm, admits a linear-time SPE implementation. A relation between instructions called address dependence is introduced, which limits memory-access overlap and can lead to superlinear time, as illustrated with the classical merging algorithm.
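The abstract does not give the formal definition of address dependence, so the sketch below illustrates only one informal reading of it: in a two-way merge, the cell to be read next is not known until the comparison of the values just loaded has resolved. The function name `merge_with_trace` and the trace format are hypothetical, introduced here for illustration.

```python
def merge_with_trace(a, b):
    """Classical two-way merge of sorted lists, also recording each cell read.

    Returns (merged_list, trace), where trace is a list of (array_name, index)
    pairs in the order in which input cells are consumed.
    """
    out, trace = [], []
    i = j = 0
    while i < len(a) and j < len(b):
        # Which index advances depends on the values just compared, so the
        # address of the next read is data-dependent, not fixed in advance.
        if a[i] <= b[j]:
            trace.append(("a", i))
            out.append(a[i])
            i += 1
        else:
            trace.append(("b", j))
            out.append(b[j])
            j += 1
    # Once one input is exhausted, the remaining addresses are predictable.
    for k in range(i, len(a)):
        trace.append(("a", k))
        out.append(a[k])
    for k in range(j, len(b)):
        trace.append(("b", k))
        out.append(b[k])
    return out, trace


if __name__ == "__main__":
    print(merge_with_trace([1, 4], [2, 3]))
    print(merge_with_trace([1, 2], [3, 4]))
```

Both calls merge inputs of the same sizes into the same output, yet they consume the input cells in different orders. A direct-flow program, by contrast, has an address sequence that is fixed regardless of the data, which is what allows its accesses to be issued ahead of time.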
Funder
Sixth Framework Programme
Ministero dell'Istruzione, dell'Università e della Ricerca
Publisher
Association for Computing Machinery (ACM)
Subject
Artificial Intelligence, Hardware and Architecture, Information Systems, Control and Systems Engineering, Software
Cited by
4 articles.
1. Optimal On-Line Computation of Stack Distances for MIN and OPT;Proceedings of the Computing Frontiers Conference;2017-05-15
2. Outline of a Thick Control Flow Architecture;2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW);2016-10
3. Efficient Stack Distance Computation for a Class of Priority Replacement Policies;International Journal of Parallel Programming;2012-07-20
4. Efficient stack distance computation for priority replacement policies;Proceedings of the 8th ACM International Conference on Computing Frontiers - CF '11;2011