ZPP: A Dynamic Technique to Eliminate Cache Pollution in NoC based MPSoCs

Author:

Deb Dipika1ORCID,Jose John1ORCID

Affiliation:

1. Indian Institute of Technology Guwahati, India

Abstract

Data prefetching efficiently reduces the memory access latency in NUCA architectures as the Last Level Cache (LLC) is shared and distributed across multiple cores. But cache pollution generated by prefetcher reduces its efficiency by causing contention for shared resources such as LLC and the underlying network. The paper proposes Zero Pollution Prefetcher (ZPP) that eliminates cache pollution for NUCA architecture. For this purpose, ZPP uses L1 prefetcher and places the prefetched blocks in the data locations of LLC where modified blocks are stored. Since modified blocks in LLC are stale and request for such blocks are served from the exclusively owned private cache, their space unnecessary consumes power to maintain such stale data in the cache. The benefits of ZPP are (a) Eliminates cache pollution in L1 and LLC by storing prefetched blocks in LLC locations where stale blocks are stored. (b) Insufficient cache space is solved by placing prefetched blocks in LLC as LLCs are larger in size than L1 cache. This helps in prefetching more cache blocks, thereby increasing prefetch aggressiveness. (c) Increasing prefetch aggressiveness increases its coverage. (d) It also maintains an equivalent lookup latency to L1 cache for prefetched blocks. Experimentally it has been found that ZPP increases weighted speedup by 2.19x as compared to a system with no prefetching while prefetch coverage and prefetch accuracy increases by 50%, and 12%, respectively compared to the baseline. 1

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture,Software

Reference41 articles.

1. N. Agarwal, T. Krishna, Li-Shiuan Peh, and N. K. Jha. 2009. GARNET: A detailed on-chip network model inside a full-system simulator. In IEEE International Symposium on Performance Analysis of Systems and Software. 33–42.

2. Cache coherence protocols: evaluation using a multiprocessor simulation model

3. M. Bakhshalipour, P. Lotfi-Kamran, and H. Sarbazi-Azad. 2018. Domino temporal data prefetcher. In International Symposium on High Performance Computer Architecture. 131–142.

4. Networks on chips: a new SoC paradigm

5. Rahul Bera, Anant V. Nori, Onur Mutlu, and Sreenivas Subramoney. 2019. DSPatch: Dual spatial pattern prefetcher. In Microarchitecture. 531–544.

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3