Affiliation:
1. Baidu Research and Scalable Machines Research
2. College of William and Mary
Abstract
Shared-memory parallel programs routinely suffer from false sharing: a performance degradation that occurs when different threads access different variables residing on the same CPU cache line and at least one of those variables is modified. State-of-the-art tools detect false sharing via a heavyweight process of logging memory accesses and feeding the ensuing access traces to an offline cache simulator. We have developed Feather, a lightweight, on-the-fly false-sharing detection tool. Feather achieves low overhead by exploiting two hardware features ubiquitous in commodity CPUs: the performance monitoring units (PMU) and debug registers. Additionally, Feather is a first-of-its-kind tool to detect false sharing in multi-process applications that use shared memory. Feather allowed us to scale false-sharing detection to myriad codes. Feather detected several false-sharing cases in important multi-core and multi-process codes, including previous PPoPP artifacts. Eliminating false sharing resulted in dramatic (up to 16x) speedups.
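The definition of false sharing above can be made concrete with a small layout sketch. This is an illustration only, not code from the paper or from Feather: the struct names, the helper `same_cache_line`, and the 64-byte line size are all assumptions chosen for the example. Imagine one thread updating field `a` and another updating field `b`; no data is logically shared, yet in the packed layout both fields land on one cache line.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines (common on x86-64, not universal).
constexpr std::size_t kCacheLine = 64;

// Hypothetical layout prone to false sharing: two independent counters,
// each meant for a different thread, packed into the same cache line.
struct alignas(kCacheLine) PackedCounters {
    std::uint64_t a = 0;  // written by thread 1 (in the imagined workload)
    std::uint64_t b = 0;  // written by thread 2
};

// Hypothetical fix: align each counter to its own cache line.
struct PaddedCounters {
    alignas(kCacheLine) std::uint64_t a = 0;
    alignas(kCacheLine) std::uint64_t b = 0;
};

// True if both addresses fall in the same kCacheLine-sized block.
inline bool same_cache_line(const void* p, const void* q) {
    const auto x = reinterpret_cast<std::uintptr_t>(p) / kCacheLine;
    const auto y = reinterpret_cast<std::uintptr_t>(q) / kCacheLine;
    return x == y;
}

// Compile-time view of the same fact: the packed fields are 8 bytes apart,
// the padded fields are a full cache line apart.
static_assert(offsetof(PackedCounters, b) - offsetof(PackedCounters, a) < kCacheLine,
              "packed counters share a line");
static_assert(offsetof(PaddedCounters, b) - offsetof(PaddedCounters, a) >= kCacheLine,
              "padded counters sit on separate lines");
```

Calling `same_cache_line(&c.a, &c.b)` on a `PackedCounters` object returns true, and on a `PaddedCounters` object returns false. The padding trades memory (128 bytes instead of 64 here) for the removal of cache-line contention between the two writers.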
Funder
National Science Foundation
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Cited by
16 articles.