Fault-tolerant wait-free shared objects

Published:1998-05 Issue:3 Volume:45 Page:451-500
ISSN:0004-5411
Container-title:Journal of the ACM
language:en
Short-container-title:J. ACM

Author:

Jayanti Prasad¹, Chandra Tushar Deepak², Toueg Sam³

Affiliation:

1. Dartmouth College, Hanover, NH

2. IBM T. J. Watson Research Center, Hawthorne, NY

3. Cornell Univ., Ithaca, NY

Abstract

Wait-free implementations of shared objects tolerate the failure of processes, but not the failure of base objects from which they are implemented. We consider the problem of implementing shared objects that tolerate the failure of both processes and base objects. We identify two classes of object failures: responsive and nonresponsive. With responsive failures, a faulty object responds to every operation, but its responses may be incorrect. With nonresponsive failures, a faulty object may also “hang” without responding. In each class, we define crash, omission, and arbitrary modes of failure. We show that all responsive failure modes can be tolerated. More precisely, for all responsive failure modes ℱ, object types T, and t ≥ 0, we show how to implement a shared object of type T which is t-tolerant for ℱ. Such an object remains correct and wait-free even if up to t base objects fail according to ℱ. In contrast to responsive failures, we show that even the most benign nonresponsive failure mode cannot be tolerated. We also show that randomization can be used to circumvent this impossibility result. Graceful degradation is a desirable property of fault-tolerant implementations: the implemented object never fails more severely than the base objects it is derived from, even if all the base objects fail. For several failure modes, we show whether this property can be achieved, and, if so, how.
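
To make the notion of t-tolerance for a responsive failure mode concrete, the following is a minimal sequential sketch, not the construction from the paper. It assumes a single caller (no concurrency, so wait-freedom is trivial), uses 2t + 1 replicated base registers, and models an arbitrary responsive failure as a register that still answers every operation but returns a fixed wrong value. The names `BaseRegister`, `TolerantRegister`, and `fail` are hypothetical and chosen only for this illustration.

```python
from collections import Counter


class BaseRegister:
    """Simulated base object. A faulty instance is *responsive*: it still
    answers every operation, but its answers may be wrong."""

    def __init__(self, value=None):
        self._value = value
        self._faulty = False
        self._garbage = None

    def fail(self, garbage):
        # Hypothetical fault injection: from now on reads return `garbage`.
        self._faulty = True
        self._garbage = garbage

    def write(self, value):
        if not self._faulty:
            self._value = value
        return "ack"  # a responsive object always responds

    def read(self):
        return self._garbage if self._faulty else self._value


class TolerantRegister:
    """Sequential register built from 2t + 1 base registers (an assumption
    of this sketch). Reads return the value reported by at least t + 1 bases."""

    def __init__(self, t, initial=None):
        self.t = t
        self.bases = [BaseRegister(initial) for _ in range(2 * t + 1)]

    def write(self, value):
        for base in self.bases:
            base.write(value)

    def read(self):
        votes = Counter(base.read() for base in self.bases)
        value, count = votes.most_common(1)[0]
        if count < self.t + 1:
            raise RuntimeError("more than t base registers appear faulty")
        return value


# Usage: two of five base registers fail; the read is still correct.
reg = TolerantRegister(t=2)
reg.write(42)
reg.bases[0].fail(garbage=7)
reg.bases[3].fail(garbage=7)
result = reg.read()  # 42, backed by the three correct base registers
```

The voting threshold of t + 1 works here because any value reported by t + 1 base registers must come from at least one correct register when at most t are faulty. A nonresponsive failure would correspond to `read` or `write` never returning, in which case this loop over all base registers would block.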

Publisher

Association for Computing Machinery (ACM)

Subject

Artificial Intelligence, Hardware and Architecture, Information Systems, Control and Systems Engineering, Software

Link

https://dl.acm.org/doi/pdf/10.1145/278298.278305

References: 35 articles.

1. Computing with faulty shared objects

Cited by 49 articles.

1. Prior Work;Recoverable Mutual Exclusion;2023

2. Brief Announcement: Towards a Theory of Wear Leveling in Persistent Data Structures;Proceedings of the 2022 ACM Symposium on Principles of Distributed Computing;2022-07-20

3. On atomic registers and randomized consensus in M&M systems;Distributed Computing;2021-10-27

4. Functional Faults;Proceedings of the 32nd ACM Symposium on Parallelism in Algorithms and Architectures;2020-07-06

5. Functional faults;Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming;2020-02-19