Affiliation:
1. CERFACS, 42 Avenue G. Coriolis, 31057 Toulouse Cedex, France
2. Harwell Laboratory, Oxon OX11 0RA, England
Abstract
We study various implementations of block Gaussian elimination on full matrices and examine their performance on three vector supercomputers: the CRAY-2, the ETA-10P, and the IBM 3090-200/VF. We show that the use of Level 3 BLAS kernels allows portability without sacrificing efficiency, and that good speeds can be obtained if tuned versions of the kernels are available. Indeed, our results show that, without using any assembler language outside the kernels, we can approach the performance of assembler-coded routines on all machines.
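As an illustration of the kind of algorithm the abstract refers to, the following is a minimal sketch of a right-looking block LU factorization. It is not the authors' implementation: pivoting is omitted for brevity (so it assumes a matrix for which elimination without pivoting is stable, such as a diagonally dominant one), plain Python loops stand in for the triangular-solve (TRSM-like) and matrix-multiply (GEMM-like) operations that Level 3 BLAS kernels would perform, and the names `block_lu`, `reconstruct`, and `nb` are hypothetical.

```python
def block_lu(A, nb):
    """Blocked LU without pivoting, in place on a square list-of-lists A.

    On return, the strict lower triangle holds L (unit diagonal implied)
    and the upper triangle holds U. nb is the block size (assumed >= 1).
    """
    n = len(A)
    for k in range(0, n, nb):
        e = min(k + nb, n)  # the current diagonal block is A[k:e][k:e]

        # Step 1: unblocked LU of the diagonal block.
        for j in range(k, e):
            for i in range(j + 1, e):
                A[i][j] /= A[j][j]
                for c in range(j + 1, e):
                    A[i][c] -= A[i][j] * A[j][c]

        # Step 2 (TRSM-like): U12 = inv(L11) * A12, by forward substitution.
        for j in range(k, e):
            for i in range(j + 1, e):
                for c in range(e, n):
                    A[i][c] -= A[i][j] * A[j][c]

        # Step 3 (TRSM-like): L21 = A21 * inv(U11).
        for i in range(e, n):
            for j in range(k, e):
                for p in range(k, j):
                    A[i][j] -= A[i][p] * A[p][j]
                A[i][j] /= A[j][j]

        # Step 4 (GEMM-like): trailing update A22 = A22 - L21 * U12.
        for i in range(e, n):
            for c in range(e, n):
                A[i][c] -= sum(A[i][p] * A[p][c] for p in range(k, e))
    return A


def reconstruct(LU):
    """Multiply the packed factors back together (L times U) as a check."""
    n = len(LU)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for p in range(min(i, j) + 1):
                l_ip = 1.0 if p == i else LU[i][p]  # unit diagonal of L
                total += l_ip * LU[p][j]
            R[i][j] = total
    return R


if __name__ == "__main__":
    A0 = [[10.0, 2.0, 3.0], [2.0, 12.0, 1.0], [3.0, 1.0, 9.0]]
    LU = block_lu([row[:] for row in A0], nb=2)
    R = reconstruct(LU)
    err = max(abs(R[i][j] - A0[i][j]) for i in range(3) for j in range(3))
    print("max reconstruction error:", err)
```

In this formulation most of the arithmetic falls in Step 4, the matrix-matrix update. Concentrating the work in a few matrix-matrix operations is what lets a portable code rely on tuned kernels for its speed.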