On approximating the ideal random access machine by physical machines

Published:2009-08 Issue:5 Volume:56 Page:1-57
ISSN:0004-5411
Container-title:Journal of the ACM
language:en
Short-container-title:J. ACM

Author:

Bilardi Gianfranco¹,Ekanadham Kattamuri²,Pattnaik Pratap²

Affiliation:

1. Università di Padova, Padova, Italy

2. IBM T.J. Watson Research Center, Yorktown Heights, NY

Abstract

The capability of the Random Access Machine (RAM) to execute any instruction in constant time is not realizable, due to fundamental physical constraints on the minimum size of devices and on the maximum speed of signals. This work explores how well the ideal RAM performance can be approximated, for significant classes of computations, by machines whose building blocks have constant size and are connected at a constant distance. A novel memory structure is proposed, which is pipelined (can accept a new request at each cycle) and hierarchical, exhibiting optimal latency $a(x) = O(x^{1/d})$ to address $x$, in $d$-dimensional realizations. In spite of block-transfer or other memory-pipeline capabilities, a number of previous machine models do not achieve a full overlap of memory accesses. These are examples of machines with explicit data movement. It is shown that there are direct-flow computations (without branches and indirect accesses) that require time superlinear in the number of instructions, on all such machines. To circumvent the explicit-data-movement constraints, the Speculative Prefetcher (SP) and the Speculative Prefetcher and Evaluator (SPE) processors are developed. Both processors can execute any direct-flow program in linear time. The SPE also executes in linear time a class of loop programs that includes many significant algorithms. Even quicksort, a somewhat irregular, recursive algorithm, admits a linear-time SPE implementation. A relation between instructions called address dependence is introduced, which limits memory-access overlap and can lead to superlinear time, as illustrated with the classical merging algorithm.
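To make the latency claim concrete, the following is a minimal toy cost model, not the paper's machine or its proofs. It assumes an access to address x takes x^(1/d) cycles and compares two extremes: accesses that never overlap, and a pipeline that issues one request per cycle with full overlap. The function names (`latency`, `serialized_time`, `pipelined_time`) and the choice d = 3 are illustrative assumptions.

```python
def latency(x: int, d: int = 3) -> float:
    """Assumed toy latency to reach address x in a d-dimensional layout."""
    return max(1.0, x ** (1.0 / d))


def serialized_time(addresses, d: int = 3) -> float:
    """No overlap: each access starts only after the previous one completes."""
    return sum(latency(x, d) for x in addresses)


def pipelined_time(addresses, d: int = 3) -> float:
    """Full overlap: request i is issued at cycle i and completes
    latency(x) cycles later; total time is the last completion."""
    return max(i + latency(x, d) for i, x in enumerate(addresses))


if __name__ == "__main__":
    n = 100_000
    addresses = range(1, n + 1)  # touch addresses 1..n once each
    print("serialized:", serialized_time(addresses))  # grows like n^(4/3) for d = 3
    print("pipelined: ", pipelined_time(addresses))   # grows like n + n^(1/3)
```

In this sketch the serialized total grows superlinearly in the number of accesses, while the fully pipelined total stays linear. Full overlap requires that addresses be known in advance; when an address depends on previously loaded values, as the abstract notes for merging, requests cannot be issued every cycle and the cost drifts toward the serialized end.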

Funder

Sixth Framework Programme

Ministero dell'Istruzione, dell'Università e della Ricerca

Publisher

Association for Computing Machinery (ACM)

Subject

Artificial Intelligence, Hardware and Architecture, Information Systems, Control and Systems Engineering, Software

Link

https://dl.acm.org/doi/pdf/10.1145/1552285.1552288

References: 38 articles.

1. On the Physical Design of PRAMs

2. A model for hierarchical memory

3. Hierarchical memory with block transfer

4. Allen, R. and Kennedy, K. 2002. Optimizing Compilers for Modern Architectures. Morgan Kaufmann, San Francisco.

5. Alpern, B., Carter, L., Feig, E., and Selker, T. 1994. The uniform memory hierarchy model of computation. Algorithmica 12, 2/3, 72-109.

Cited by 4 articles.

1. Optimal On-Line Computation of Stack Distances for MIN and OPT;Proceedings of the Computing Frontiers Conference;2017-05-15

2. Outline of a Thick Control Flow Architecture;2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW);2016-10

3. Efficient Stack Distance Computation for a Class of Priority Replacement Policies;International Journal of Parallel Programming;2012-07-20

4. Efficient stack distance computation for priority replacement policies;Proceedings of the 8th ACM International Conference on Computing Frontiers - CF '11;2011