Affiliation:
1. PRISM Laboratory, University of Versailles, France
Abstract
Memory hierarchies are a key component in obtaining high performance on modern microprocessors. To satisfy the ever-increasing demand for higher data access rates, they are also becoming increasingly complex: multilevel caches, non-blocking caches, sophisticated instructions for prefetch and cache control, etc. Although these advanced features promise large performance gains, in some cases they also generate performance “anomalies” (i.e., poor performance triggered by specific code patterns). To locate and understand these anomalies precisely, a new set of microbenchmarks called WBTK is introduced. Through systematic experimentation on the Alpha 21264, Power4, and Itanium1, we show first that these microbenchmarks allowed us to detect most of the anomalies encountered on simple BLAS1-type codes, and second that vectorization of memory accesses is an efficient workaround for most of these anomalies.
Subject
Hardware and Architecture, Theoretical Computer Science, Software
Cited by
5 articles.