Explorations and Exploitation for Parity-based RAIDs with Ultra-fast SSDs

Published: 2024-01-30 Issue: 1 Volume: 20 Page: 1-32
ISSN: 1553-3077
Container-title: ACM Transactions on Storage
Language: en
Short-container-title: ACM Trans. Storage

Author:

Wang Shucheng¹, Cao Qiang², Jiang Hong³, Lu Ziyi², Yao Jie², Chen Yuxing⁴, Pan Anqun⁴

Affiliation:

1. China Mobile (Suzhou) Software Technology Co., Ltd., China; Huazhong University of Science and Technology, China

2. Huazhong University of Science and Technology, China

3. Department of Computer Science and Engineering, University of Texas at Arlington, USA

4. Tencent Inc., China

Abstract

Following the conventional design principle of spending more fast CPU cycles to save slow I/Os, Linux Multiple-Disk (MD), the popular software storage architecture for parity-based RAID (e.g., RAID5 and RAID6), assigns one or more centralized worker threads to efficiently process all user requests using multi-stage asynchronous control and global data structures. This design successfully exploits the characteristics of slow devices such as Hard Disk Drives (HDDs). However, we observe that, with high-performance NVMe-based Solid State Drives (SSDs), even the recently added multi-worker processing mode in MD achieves only limited performance gains because of severe lock contention under intensive write workloads. In this paper, we propose StRAID, a novel stripe-threaded RAID architecture that assigns a dedicated worker thread to each stripe write (a one-for-one model) to fully exploit the high parallelism inherent among RAID stripes, multi-core processors, and SSDs. For the notoriously performance-punishing partial-stripe writes, which induce extra read and write I/Os, StRAID presents a two-stage stripe write mechanism and a two-dimensional multi-log SSD buffer. All writes are first opportunistically batched in memory; they are then written to the primary RAID as aggregated full-stripe writes or conditionally redirected to the buffer as partial-stripe writes. The buffered data are strategically reclaimed to the primary RAID. We evaluate a StRAID prototype with a variety of benchmarks and real-world traces. StRAID outperforms MD by up to 5.8 times in write throughput.
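The abstract's point that partial-stripe writes induce extra read and write I/Os can be illustrated with generic RAID5 XOR parity. The following is a minimal sketch under stated assumptions, not code from StRAID or Linux MD: the helper name `xor_blocks`, the three-data-chunk stripe layout, and the chunk contents are all hypothetical and chosen only for illustration.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks byte by byte (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One illustrative RAID5 stripe: three data chunks plus one parity chunk.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(*data)

# Full-stripe write: all data chunks are replaced, so the new parity is
# computed from the new data alone and nothing has to be read from disk.
new_stripe = [b"\x0a\x0b", b"\x0c\x0d", b"\x0e\x0f"]
full_parity = xor_blocks(*new_stripe)

# Partial-stripe write: only chunk 1 changes. The read-modify-write path
# must first read the old chunk and the old parity (two extra reads),
# then write the new chunk and the updated parity (two writes).
new_chunk = b"\xaa\xbb"
rmw_parity = xor_blocks(parity, data[1], new_chunk)
data[1] = new_chunk

# The incrementally updated parity matches a recomputation from scratch.
recomputed_parity = xor_blocks(*data)
```

In this sketch the full-stripe path touches only new data, while the partial-stripe path depends on the old chunk and old parity, which is the extra I/O that batching writes into full stripes is meant to avoid.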

Funder

National Key Research and Development Program of China

NSFC

US National Science Foundation

Key Research and Development Project of Hubei

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture

Link

https://dl.acm.org/doi/pdf/10.1145/3627992
