Higher reliability redundant disk arrays

Published:2009-11 Issue:3 Volume:5 Page:1-59
ISSN:1553-3077
Container-title:ACM Transactions on Storage
language:en
Short-container-title:ACM Trans. Storage

Author:

Thomasian Alexander¹, Blaum Mario²

Affiliation:

1. Thomasian and Associates, Pleasantville, NY

2. Universidad Complutense de Madrid (UCM), Madrid, Spain

Abstract

Parity is a popular form of data protection in redundant arrays of inexpensive/independent disks (RAID). RAID5 dedicates one out of N disks to parity to mask single disk failures, that is, the contents of a block on a failed disk can be reconstructed by exclusive-ORing the corresponding blocks on surviving disks. RAID5 can mask a single disk failure, and it is vulnerable to data loss if a second disk failure occurs. The RAID5 rebuild process systematically reconstructs the contents of a failed disk on a spare disk, returning the system to its original state, but the rebuild process may be unsuccessful due to unreadable sectors. This has led to two disk failure tolerant arrays (2DFTs), such as RAID6 based on Reed-Solomon (RS) codes. EVENODD, RDP (Row-Diagonal-Parity), the X-code, and RM2 (Row-Matrix) are 2DFTs with parity coding. RM2 incurs a higher level of redundancy than two disks, while the X-code is limited to a prime number of disks. RDP is optimal with respect to the number of XOR operations at the encoding, but not for short write operations. For small symbol sizes EVENODD and RDP have the same disk access pattern as RAID6, while RM2 and the X-code incur a high recovery cost with two failed disks. We describe variations to RAID5 and RAID6 organizations, including clustered RAID, different methods to update parities, rebuild processing, disk scrubbing to eliminate sector errors, and the intra-disk redundancy (IDR) method to deal with sector errors. We summarize the results of recent studies of failures in hard disk drives. We describe Markov chain reliability models to estimate RAID mean time to data loss (MTTDL) taking into account sector errors and the effect of disk scrubbing. Numerical results show that RAID5 plus IDR attains the same MTTDL level as RAID6, while incurring a lower performance penalty. We conclude with a survey of analytic and simulation studies of RAID performance and tools and benchmarks for RAID performance evaluation.
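
The XOR reconstruction the abstract describes for RAID5 can be illustrated with a minimal sketch. It is not taken from the paper: the function names (`xor_blocks`, `make_stripe`, `reconstruct`) are hypothetical, blocks are modeled as in-memory byte strings, and the parity block is simply placed last in the stripe, ignoring how a real array distributes parity across disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Build a stripe: the data blocks followed by one parity block.

    Assumption for illustration: parity is always the last block.
    """
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct(stripe, failed_index):
    """Recover the block at failed_index by XORing all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    stripe = make_stripe([b"\x01\x02", b"\x0f\x00", b"\xf0\xff"])
    # Pretend the disk holding block 1 failed and rebuild its contents.
    print(reconstruct(stripe, 1) == b"\x0f\x00")
```

Because XOR is its own inverse, the same routine recovers either a data block or the parity block. If two blocks of a stripe are lost, the survivors no longer determine the missing contents, which is the single-failure limit the abstract notes for RAID5.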

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture

Link

https://dl.acm.org/doi/pdf/10.1145/1629075.1629076

References: 192 articles.

1. Tolerating multiple failures in RAID architectures with optimal storage and uniform declustering

2. Minerva

3. Quickly finding near-optimal storage designs

Cited by 41 articles.

1. Elastic RAID: Implementing RAID over SSDs with Built-in Transparent Compression;Proceedings of the 16th ACM International Conference on Systems and Storage;2023-06-05

2. Reliability Evaluation of Erasure-coded Storage Systems with Latent Errors;ACM Transactions on Storage;2023-01-11

3. Bibliography;Storage Systems;2022

4. Heterogeneous Disk Arrays - HDAs;Storage Systems;2022

5. Coding for multiple disk failures;Storage Systems;2022