HAShCache

Author:

Patil Adarsh (1), Govindarajan Ramaswamy (1)

Affiliation:

1. Indian Institute of Science, Bangalore, Karnataka, India

Abstract

Integrated Heterogeneous System (IHS) processors pack throughput-oriented General-Purpose Graphics Processing Units (GPGPUs) alongside latency-oriented Central Processing Units (CPUs) on the same die, sharing certain resources, e.g., the last-level cache, the Network-on-Chip (NoC), and the main memory. The demands for memory accesses and other shared resources from GPU cores can exceed those of CPU cores by two to three orders of magnitude. This disparity poses significant problems in exploiting the full potential of these architectures. In this article, we propose adding a large-capacity stacked DRAM, used as a shared last-level cache, to IHS processors. However, adding the DRAMCache naively leaves significant performance on the table due to the disparate demands from CPU and GPU cores for DRAMCache and memory accesses. In particular, the imbalance can significantly reduce the performance benefits that the CPU cores would otherwise have enjoyed with the introduction of the DRAMCache, necessitating heterogeneity-aware management of this shared resource for improved performance. We propose three simple techniques to enhance the performance of CPU applications while ensuring little to no performance impact on the GPU. Specifically, we propose (i) PrIS, a prioritization scheme for scheduling CPU requests at the DRAMCache controller; (ii) ByE, a selective and temporal bypassing scheme for CPU requests at the DRAMCache; and (iii) Chaining, an occupancy-controlling mechanism for GPU lines in the DRAMCache through pseudo-associativity. The resulting cache, Heterogeneity-Aware Shared DRAMCache (HAShCache), is heterogeneity-aware and can adapt dynamically to address the inherent disparity of demands in an IHS architecture. Experimental evaluation of the proposed HAShCache shows an average system performance improvement of 41% over a naive DRAMCache and over 200% improvement over a baseline system with no stacked DRAMCache.
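
The abstract describes PrIS only at a high level, as a scheme that prioritizes CPU requests at the DRAMCache controller. As a rough illustration of that general idea, and not the authors' implementation, the Python sketch below models a request queue that serves CPU requests before GPU requests and keeps arrival order within each class. The class name `PriorityRequestQueue`, the `Request` fields, the numeric priority encoding, and the helper `drain` are all hypothetical names chosen for this example; the paper's actual policy may differ, for instance in how it protects GPU requests from starvation.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

# Illustrative encoding (an assumption): lower value is served first.
CPU, GPU = 0, 1


@dataclass(order=True)
class Request:
    priority: int                      # CPU or GPU class, compared first
    seq: int                           # arrival order, breaks ties (FIFO)
    addr: int = field(compare=False)   # address, not used for ordering
    source: str = field(compare=False)


class PriorityRequestQueue:
    """Hypothetical controller queue: CPU requests first, FIFO within a class."""

    def __init__(self):
        self._heap = []
        self._seq = count()

    def enqueue(self, source, addr):
        prio = CPU if source == "cpu" else GPU
        heapq.heappush(self._heap, Request(prio, next(self._seq), addr, source))

    def dequeue(self):
        # Returns the next request to service, or None when the queue is empty.
        return heapq.heappop(self._heap) if self._heap else None


def drain(queue):
    """Service every pending request and return the (source, addr) order."""
    order = []
    while (req := queue.dequeue()) is not None:
        order.append((req.source, req.addr))
    return order
```

With two GPU and two CPU requests arriving interleaved, `drain` returns both CPU requests first, each class in arrival order. A strict policy like this one can starve GPU requests under sustained CPU load, which is why a realistic scheduler would likely bound the delay it imposes on the lower-priority class.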

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Information Systems, Software


Cited by 3 articles.