Affiliation:
1. Indiana University, Bloomington, IN
Abstract
Suppose the bits of a computer word are partitioned into
d
disjoint sets, each of which is used to represent one of a
d
-tuple of cartesian indices into
d
-dimensional space. Then, regardless of the partition, simple group operations and comparisons can be implemented for each index on a conventional processor in a sequence of two or three register operations.These indexings allow any blocked algorithm from linear algebra to use some non-standard matrix orderings that increase locality and enhance their performance. The underlying implementations were designed for alternating bit postitions to index Morton-ordered matrices, but they apply, as well, to any bit partitioning. A hybrid ordering of the elements of a matrix becomes possible, therefore, with row-/column-major ordering within cache-sized blocks and Morton ordering of those blocks, themselves. So, one can enjoy the temporal locality of nested blocks, as well as compiler optimizations on row- or column-major ordering in base blocks.
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design,Software
Cited by
7 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献