Affiliation:
1. Faculty of Mathematics, Physics, and Computer Science, Cracow University of Technology, ul. Warszawska 24, 31-155 Cracow, Poland
Abstract
Streaming SIMD Extensions (SSE) and Advanced Vector Extensions (AVX) are additional processor instruction sets available in contemporary personal computers, designed for vectorized floating-point calculations. Unfortunately, to exploit these instructions one cannot rely on the automatic options of high-level language compilers. Instead, handwritten assembly language or inserted intrinsic function calls are necessary. Following this approach, an accelerated C++ code is devised for solving (quasi-) block-tridiagonal linear algebraic equation systems by means of an extended Thomas algorithm. Speedups reaching 3.5 and 3 (relative to C++ without SSE/AVX) are demonstrated for single and double precision calculations, respectively.
Publisher
World Scientific Pub Co Pte Lt
Subject
Computational Mathematics, Computer Science (miscellaneous)