Affiliation:
1. THINKING MACHINES CORPORATION AND HARVARD UNIVERSITY
CAMBRIDGE, MASSACHUSETTS
Abstract
An all-to-all broadcast algorithm that exploits concurrent communication on all channels of the Connection Machine system CM-200 binary cube network is described. Issues in integrating a physical all-to-all broadcast between processing nodes into a language environment using a global address space are discussed. Timings for the physical broadcast between nodes and for the virtual broadcast are given. The peak data transfer rate for the physical broadcast on a CM-200 is 5.9 gigabytes/sec, and the peak rate for the virtual broadcast is 31 gigabytes/sec. Array reshaping is an effective performance optimization technique. An example is given where reshaping improved performance by a factor of 7 by reducing the amount of local data motion. We also show how to exploit symmetry for computation of an interaction matrix using the all-to-all broadcast function. Further optimizations are suggested for N-body-type calculations. Using the all-to-all broadcast function, a peak rate of 9.3 GFLOPS/sec has been achieved for the N-body computations in 32-bit precision on a 2,048 node Connection Machine system CM-200.
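To make the term concrete, here is a minimal sketch of an all-to-all broadcast on a binary cube, simulated in Python. It is not the algorithm the abstract describes: it uses the textbook dimension-exchange scheme, which drives only one cube dimension per step, whereas the paper's algorithm exploits concurrent communication on all channels. The function name `all_to_all_broadcast` and the representation of each node's data as a set of source addresses are assumptions made for illustration.

```python
def all_to_all_broadcast(d):
    """Simulate an all-to-all broadcast on a d-dimensional binary cube.

    Hypothetical illustration: each of the 2**d nodes starts with one
    datum (labelled by its own address) and ends up holding every datum.
    """
    n = 1 << d
    # Each node initially holds only its own datum.
    held = [{node} for node in range(n)]
    for dim in range(d):
        bit = 1 << dim
        # In step `dim`, every node exchanges everything it currently holds
        # with its neighbor across cube dimension `dim` (address XOR bit).
        held = [held[node] | held[node ^ bit] for node in range(n)]
    return held


if __name__ == "__main__":
    result = all_to_all_broadcast(3)
    # After d = 3 exchange steps, each of the 8 nodes holds all 8 data items.
    print(all(items == set(range(8)) for items in result))
```

In this simplified scheme the amount of data each node holds doubles at every step, so the last exchange moves half of all the data across a single channel per node. That imbalance is one reason an algorithm that keeps all cube channels busy at once, as the abstract describes, can reach a higher transfer rate.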