Affiliation:
1. Iowa State University, Iowa, USA
Abstract
Memory system reliability is increasingly a concern as memory cell density and capacity continue to grow. The conventional approach is to use redundant memory bits for error detection and correction, with significant storage, cost and power overheads. In this paper, we propose a novel, system-level scheme called MemGuard for memory error detection. With OS-based checkpointing, it is also able to recover program execution from memory errors. The memory error detection of MemGuard is motivated by memory integrity verification using log hashes. It is much stronger than SECDED in error detection, incurs negligible hardware cost and energy overhead and no storage overhead, and is compatible with various memory organizations. It may play the role of ECC memory in consumer-level computers and mobile devices, without the shortcomings of ECC memory. In server computers, it may complement SECDED ECC or Chipkill Correct by providing even stronger error detection.
We have comprehensively investigated and evaluated the feasibility and reliability of MemGuard. We show that, with an incremental multiset hash function and a non-cryptographic hash function, the performance and energy overheads of MemGuard are negligible. We use mathematical deduction and synthetic simulation to prove that MemGuard is robust and reliable.
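To make the log-hash idea concrete, the following is a minimal sketch, not MemGuard's actual design. It assumes a word-addressed memory, an additive multiset hash modulo 2^64, and a splitmix64-style mixing function as the non-cryptographic hash; the names `LogHashMemory`, `fetch`, `evict` and `verify` are hypothetical and chosen only for illustration. Every word written to memory is added to a write log hash, every word read from memory is added to a read log hash, and a check reads the words still in memory once so that the two hashes should match if no error occurred.

```python
MASK64 = (1 << 64) - 1


def mix64(x):
    """splitmix64-style finalizer: a non-cryptographic, bijective 64-bit mixer."""
    x &= MASK64
    x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    x = ((x ^ (x >> 27)) * 0x94D049BB133111EB) & MASK64
    return x ^ (x >> 31)


def item_hash(addr, value):
    """Hash one (address, value) pair; this pairing is an assumption of the sketch."""
    return mix64(mix64(addr) + value)


class LogHashMemory:
    """Toy memory that keeps incremental read and write log hashes."""

    def __init__(self, n_words):
        self.mem = [0] * n_words
        self.resident = set()  # addresses currently held in a simulated cache
        self.read_hash = 0
        self.write_hash = 0
        # The initial contents count as one write per word.
        for addr in range(n_words):
            self.write_hash = (self.write_hash + item_hash(addr, 0)) & MASK64

    def fetch(self, addr):
        """Read a word from memory into the cache and log it in the read hash."""
        assert addr not in self.resident, "word is already cached"
        value = self.mem[addr]
        self.read_hash = (self.read_hash + item_hash(addr, value)) & MASK64
        self.resident.add(addr)
        return value

    def evict(self, addr, value):
        """Write a cached word back to memory and log it in the write hash."""
        self.resident.remove(addr)
        self.mem[addr] = value
        self.write_hash = (self.write_hash + item_hash(addr, value)) & MASK64

    def verify(self):
        """Account for every word still in memory; the hashes match if no error occurred."""
        total = self.read_hash
        for addr, value in enumerate(self.mem):
            if addr not in self.resident:
                total = (total + item_hash(addr, value)) & MASK64
        return total == self.write_hash


if __name__ == "__main__":
    m = LogHashMemory(8)
    word = m.fetch(3)
    m.evict(3, word + 42)
    print("clean memory verifies:", m.verify())
    m.mem[5] ^= 1  # simulate a single-bit memory error
    print("after bit flip verifies:", m.verify())
```

Because the multiset hash is incremental, each memory access costs one hash and one addition, and no per-word check bits are stored. In this sketch `verify` leaves the logs untouched; a real scheme would typically restart them at each checkpoint.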
Funder
Division of Computer and Network Systems
Publisher
Association for Computing Machinery (ACM)
Cited by
3 articles.