Affiliation:
1. Center for Reliable and High Performance Computing, University of Illinois at Urbana-Champaign, 1308 W. Main Street, Urbana, IL
Abstract
This study presents a characterization of (1) the global memory and interconnection network contention overhead, (2) the operating system overheads, and (3) the runtime system parallelization overheads for the Cedar shared-memory multiprocessor. The measurements were obtained using five representative compute-intensive, scientific, loop-parallel applications from the Perfect Benchmark Suite. The overheads were measured for a range of Cedar configurations, from 1 processor to the full 4-cluster/32-processor configuration, thus characterizing the effect of this scaling on the overheads. For the full 4-cluster Cedar, the operating system overhead was found to constitute 5–21% of the total completion time of an application. The parallelization overhead accounts for 10–25% of the application completion time, and the overhead due to global memory and network contention contributes 8–21% of the application completion time.
Publisher
Association for Computing Machinery (ACM)
Cited by
2 articles.
1. Non-preemptive speed scaling;Journal of Scheduling;2013-01-31
2. On the Value of Preemption in Scheduling;Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques;2006