Embedded RAIDs-on-chip for bus-based chip-multiprocessors

Author:

Bathen Luis Angel D.1,Dutt Nikil D.1

Affiliation:

1. University of California, Irvine, CA

Abstract

The dual effects of larger die sizes and technology scaling, combined with aggressive voltage scaling for power reduction, increase the error rates for on-chip memories. Traditional on-chip memory reliability techniques (e.g., ECC) incur significant power and performance overheads. In this article, we propose a low-power-and-performance-overhead Embedded RAID (E-RAID) strategy and present Embedded RAIDs-on-Chip (E-RoC), a distributed dynamically managed reliable memory subsystem for bus-based Chip-Multiprocessors. E-RoC achieves reliability through redundancy by optimizing RAID-like policies tuned for on-chip distributed memories. We achieve on-chip reliability of memories through the use of Distributed Dynamic ScratchPad Allocatable Memories (DSPAMs) and their allocation policies. We exploit aggressive voltage scaling to reduce power consumption overheads due to parallel DSPAM accesses, and rely on the E-RoC Manager to automatically handle any resulting voltage-scaling-induced errors. We demonstrate how E-RAIDs can further enhance the fault tolerance of traditional memory reliability approaches by designing E-RAID levels that exploit ECC. Finally, we show the power and flexibility of the E-RoC concept by showing the benefits of having a heterogeneous E-RAID levels that fit each application's needs (fault tolerance, power/energy, performance). Our experimental results on CHStone/Mediabench II benchmarks show that our E-RAID levels converge to 100% error-free data rates much faster than traditional ECC approaches. Moreover, E-RAID levels that exploit ECC can guarantee 99.9% error-free data rates at ultra low Vdd on average, where as traditional ECC approaches were able to attain at most 99.1% error-free data rates. We observe an average of 22% dynamic power consumption increase by using traditional ECC approaches with respect to the baseline (non-voltage scaled SPMs), whereas our E-RAID levels are able to save dynamic power consumption by an average of 27% (w.r.t. the same non-voltage scaled SPMs baseline), while incurring worst-case 2% higher performance overheads than traditional ECC approaches. By voltage scaling the memories, we see that traditional ECC approaches are able to save static energy by 6.4% (average), where as our E-RAID approaches achieve 23.4% static energy savings (average). Finally, we observe that mixing E-RAID levels allows us to further reduce the dynamic power consumption by up to 55.5% at the cost of an average 5.6% increase in execution time over traditional approaches.

Funder

Division of Computing and Communication Foundations

National Science Foundation

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture,Software

Cited by 2 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

1. A Cache-Assisted Scratchpad Memory for Multiple-Bit-Error Correction;IEEE Transactions on Very Large Scale Integration (VLSI) Systems;2016-11

2. In-Scratchpad Memory Replication;ACM Transactions on Design Automation of Electronic Systems;2015-09-28

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3