Affiliation:
1. University of Campinas
2. Microsoft Research
Abstract
Zombie is an endurance management framework that enables a variety of error correction mechanisms to extend the lifetimes of memories that suffer from bit failures caused by wearout, such as phase-change memory (PCM). Zombie supports both single-level cell (SLC) and multi-level cell (MLC) variants. It extends the lifetime of blocks in working memory pages (primary blocks) by pairing them with spare blocks, i.e., working blocks in pages that have been disabled due to exhaustion of a single block's error correction resources, which would be 'dead' otherwise. Spare blocks adaptively provide error correction resources to primary blocks as failures accumulate over time. This reduces the waste caused by early block failures, making working blocks in discarded pages a useful resource. Even though we use PCM as the target technology, Zombie applies to any memory technology that suffers stuck-at cell failures.
This paper describes the Zombie framework, a combination of two new error correction mechanisms (ZombieXOR for SLC and ZombieMLC for MLC) and the extension of two previously proposed SLC mechanisms (ZombieECP and ZombieERC). The result is a 58% to 92% improvement in endurance for Zombie SLC memory and an even more impressive 11x to 17x improvement for ZombieMLC, both with performance overheads of only 0.1% when memories using prior error correction mechanisms reach end of life.
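The pairing idea described in the abstract can be illustrated with a minimal bookkeeping sketch. It is not the paper's implementation: the names `Block`, `Page`, `Memory`, `record_failure`, and the constant `ECC_CAPACITY` are hypothetical, and the sketch assumes, as a simplification, that a paired spare always supplies enough correction resources. It only models the point at which a page is disabled and how its still-working blocks become spares for primary blocks in other pages.

```python
from dataclasses import dataclass
from typing import List, Optional

ECC_CAPACITY = 2  # hypothetical: bit failures a block can correct on its own


@dataclass
class Block:
    failures: int = 0
    spare: Optional["Block"] = None  # spare block paired with this primary block

    def working(self) -> bool:
        # A block works while its own correction resources are not exhausted.
        return self.failures <= ECC_CAPACITY


@dataclass
class Page:
    blocks: List[Block]
    enabled: bool = True


class Memory:
    def __init__(self, n_pages: int, blocks_per_page: int) -> None:
        self.pages = [
            Page([Block() for _ in range(blocks_per_page)]) for _ in range(n_pages)
        ]
        self.spare_pool: List[Block] = []  # working blocks from disabled pages

    def record_failure(self, page_idx: int, block_idx: int) -> str:
        page = self.pages[page_idx]
        block = page.blocks[block_idx]
        block.failures += 1
        if block.working():
            return "ok"
        if block.spare is not None:
            # Simplifying assumption: the paired spare absorbs further failures.
            return "ok"
        if self.spare_pool:
            # Pair the exhausted primary block with a spare instead of
            # disabling the whole page.
            block.spare = self.spare_pool.pop()
            return "paired"
        # No spare available: disable the page and recycle its working,
        # unpaired blocks as spares for other pages.
        page.enabled = False
        self.spare_pool.extend(
            b for b in page.blocks if b.working() and b.spare is None
        )
        return "page_disabled"
```

In this sketch the first exhausted block disables its page because no spares exist yet. Later exhausted blocks draw from the resulting pool, which is what keeps their pages in service.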
Funder
Conselho Nacional de Desenvolvimento Científico e Tecnológico
FAEPEX
Publisher
Association for Computing Machinery (ACM)
Cited by
5 articles.
1. Hard Error Correction in STT-MRAM;2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC);2024-01-22
2. ECC cache;Proceedings of the 39th International Conference on Computer-Aided Design;2020-11-02
3. Block Cooperation;ACM Transactions on Architecture and Code Optimization;2018-09-30
4. REMAP;Proceedings of the International Symposium on Memory Systems;2017-10-02
5. Balancing the Lifetime and Storage Overhead on Error Correction for Phase Change Memory;PLOS ONE;2015-07-09