Mean-Field Analysis of Coding Versus Replication in Large Data Storage Systems

Published:2018-02-24 Issue:1 Volume:3 Page:1-28
ISSN:2376-3639
Container-title:ACM Transactions on Modeling and Performance Evaluation of Computing Systems
language:en
Short-container-title:ACM Trans. Model. Perform. Eval. Comput. Syst.

Author:

Li Bin¹, Ramamoorthy Aditya², Srikant R.³

Affiliation:

1. University of Rhode Island, Kingston

2. Iowa State University, Ames

3. University of Illinois at Urbana-Champaign

Abstract

We study cloud storage systems with a very large number of files stored in a very large number of servers. In such systems, files are either replicated or coded to ensure reliability, i.e., to guarantee file recovery from server failures. This redundancy in storage can further be exploited to improve system performance (mean file-access delay) through appropriate load-balancing (routing) schemes. However, it is unclear whether coding or replication is better from a system performance perspective, since the corresponding queueing analysis of such systems is, in general, quite difficult except for the trivial case when the system load asymptotically tends to zero. Here, we study the more difficult case where the system load is not asymptotically zero. Using the fact that the system size is large, we obtain a mean-field limit for the steady-state distribution of the number of file access requests waiting at each server. We then use the mean-field limit to show that, for a given storage capacity per file, coding strictly outperforms replication at all traffic loads while improving reliability. Further, the factor by which the performance improves in heavy traffic is at least as large as in the light-traffic case. Finally, we validate these results through extensive simulations.

Funder

NSF

DTRA

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications; Hardware and Architecture; Safety, Risk, Reliability and Quality; Media Technology; Information Systems; Software; Computer Science (miscellaneous)

Link

https://dl.acm.org/doi/pdf/10.1145/3159172

Reference: 33 articles.

1. Randomized load balancing with general service time distributions

2. Reducing Latency via Redundant Requests

Cited by 8 articles.

1. Reliability Evaluation of Erasure-coded Storage Systems with Latent Errors;ACM Transactions on Storage;2023-01-11

2. Tackling heterogeneous traffic in multi-access systems via erasure coded servers;Proceedings of the Twenty-Third International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing;2022-10-03

3. Latency Optimal Storage and Scheduling of Replicated Fragments for Memory Constrained Servers;IEEE Transactions on Information Theory;2022-06

4. Latency-Redundancy Tradeoff in Distributed Read-Write Systems;2022 14th International Conference on COMmunication Systems & NETworkS (COMSNETS);2022-01-04

5. Download time analysis for distributed storage systems with node failures;2021 IEEE International Symposium on Information Theory (ISIT);2021-07-12