Enhancing computation-to-core assignment with physical location information

Published:2018-12-02 Issue:4 Volume:53 Page:312-327
ISSN:0362-1340
Container-title:ACM SIGPLAN Notices
language:en
Short-container-title:SIGPLAN Not.

Author:

Kislal Orhan¹,Kotra Jagadish¹,Tang Xulong¹,Kandemir Mahmut Taylan¹,Jung Myoungsoo²

Affiliation:

1. Pennsylvania State University, USA

2. Yonsei University, South Korea

Abstract

Going beyond a certain number of cores in modern architectures requires an on-chip network more scalable than conventional buses. However, employing an on-chip network in a manycore system (to improve scalability) makes the latencies of the data accesses issued by a core non-uniform. This non-uniformity can play a significant role in shaping overall application performance. This work presents a novel compiler strategy that exposes architecture information to the compiler to enable an optimized computation-to-core mapping. Specifically, we propose a compiler-guided scheme that takes into account the relative positions of (and distances between) cores, last-level caches (LLCs) and memory controllers (MCs) in a manycore system, and generates a mapping of computations to cores with the goal of minimizing on-chip network traffic. Experimental data collected using a set of 21 multi-threaded applications reveal that, on average, our approach reduces the on-chip network latency in a 6×6 manycore system by 38.4% in the case of private LLCs, and by 43.8% in the case of shared LLCs. These improvements translate into execution time improvements of 10.9% and 12.7% for the private-LLC and shared-LLC based systems, respectively.
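The idea of a location-aware mapping can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a 6×6 mesh with hop count measured as Manhattan distance, a hypothetical per-computation table of access counts to on-chip nodes (for example, memory controllers), and a simple greedy placement. The names `hops`, `traffic_cost`, `assign`, `MC0`, and `MC1` are invented for illustration.

```python
from itertools import product

MESH = 6  # 6x6 mesh, matching the system size mentioned in the abstract


def hops(a, b):
    """Manhattan distance between two mesh coordinates (assumed routing metric)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def traffic_cost(core, accesses):
    """Hop count to each accessed node, weighted by the number of accesses."""
    return sum(count * hops(core, node) for node, count in accesses.items())


def assign(computations, mesh=MESH):
    """Greedily place each computation on the free core with the lowest cost.

    computations: dict mapping a name to {node_coordinate: access_count}.
    Computations with the most accesses are placed first.
    """
    free = set(product(range(mesh), range(mesh)))
    mapping = {}
    order = sorted(computations, key=lambda c: -sum(computations[c].values()))
    for name in order:
        # sorted(free) makes tie-breaking deterministic.
        best = min(sorted(free), key=lambda core: traffic_cost(core, computations[name]))
        mapping[name] = best
        free.remove(best)
    return mapping


# Hypothetical memory controllers attached to two opposite corner tiles.
MC0, MC1 = (0, 0), (5, 5)
demo = {
    "loop_a": {MC0: 90, MC1: 10},  # mostly talks to MC0
    "loop_b": {MC0: 10, MC1: 90},  # mostly talks to MC1
}
mapping = assign(demo)
```

In this toy input, each computation ends up on the tile nearest the controller it accesses most. A real compiler pass would also have to model LLC bank locations, data sharing between computations, and load balance, none of which this sketch attempts.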

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Graphics and Computer-Aided Design,Software

Link

https://dl.acm.org/doi/pdf/10.1145/3296979.3192386

Reference: 67 articles.

1. 2007. Intel teraflops research chip. goo.gl/lewCk7.

2. 2009. Intel Single-cloud chip. goo.gl/RSJjfg.

3. 2012. minighost. https://mantevo.org/default.php.

4. 2012. The Architecture and Performance of the TILE-Gx Processor Family. http://www.tilera.com/products/processors/TILE-Gx_Family.

5. 2013. CORAL Benchmarks. https://asc.llnl.gov/CORAL-benchmarks/

Cited by 1 article.

1. Distance-in-time versus distance-in-space;Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation;2021-06-18