Affiliation:
1. Division of AI Computer Science and Engineering, Kyonggi University, Suwon 16227, Republic of Korea
Abstract
Most existing sender-based message logging protocols cannot handle simultaneous failures: if the sender and the receiver(s) of a message fail together, the receiver(s) cannot obtain the recovery information for that message. This limitation stems from the protocols' asymmetric logging behavior. This paper presents a novel sender-based message logging protocol for distributed systems built on broadcast networks that overcomes this constraint through three features. First, when more than one process crashes at the same time, the protocol ensures the always-no-rollback property by symmetrically replicating the recovery information at every process or group member connected to the network. Second, this property holds even when the system's communication is a combination of point-to-point and group communication. Third, the communication overhead of the replication is greatly reduced by making full use of the capability of the standard broadcast network in both communication modes. Experimental results show that, regardless of the communication pattern applied, the protocol reduces total application execution time by about 4.23% to 9.96% compared with the latest approach that enables the traditional protocols to cope with simultaneous failures.
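The failure scenario described in the abstract can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the paper's protocol: the names `BroadcastNetwork`, `LogEntry`, `replicate`, and `recover_messages_for` are hypothetical, logs are modeled as volatile in-memory dictionaries, and receive sequence numbers and group communication are omitted for brevity. It only contrasts asymmetric logging (the sender alone keeps the log) with symmetric replication (every live process on the shared medium keeps a copy).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LogEntry:
    sender: int
    receiver: int
    send_seq: int  # per-sender send sequence number
    payload: str


@dataclass
class Process:
    pid: int
    alive: bool = True
    # Volatile log: (sender, send_seq) -> LogEntry
    log: dict = field(default_factory=dict)


class BroadcastNetwork:
    """Toy shared medium (hypothetical): a frame is visible to all live processes."""

    def __init__(self, n, replicate):
        self.procs = [Process(pid) for pid in range(n)]
        self.next_seq = [0] * n
        self.replicate = replicate  # True: symmetric logging, False: sender-only

    def send(self, sender, receiver, payload):
        seq = self.next_seq[sender]
        self.next_seq[sender] += 1
        entry = LogEntry(sender, receiver, seq, payload)
        for p in self.procs:
            # Sender-only mode logs at the sender; replicated mode logs at
            # every live process that overhears the frame.
            if p.alive and (self.replicate or p.pid == sender):
                p.log[(sender, seq)] = entry
        return entry

    def crash(self, *pids):
        for pid in pids:
            self.procs[pid].alive = False
            self.procs[pid].log.clear()  # volatile memory is lost on a crash

    def recover_messages_for(self, receiver):
        """Collect, from surviving processes only, the messages sent to `receiver`."""
        found = {}
        for p in self.procs:
            if p.alive:
                for key, entry in p.log.items():
                    if entry.receiver == receiver:
                        found[key] = entry
        return [found[key] for key in sorted(found)]


# Sender 0 and receiver 1 crash together; process 2 survives.
asymmetric = BroadcastNetwork(3, replicate=False)
asymmetric.send(0, 1, "m0")
asymmetric.crash(0, 1)

symmetric = BroadcastNetwork(3, replicate=True)
symmetric.send(0, 1, "m0")
symmetric.send(0, 1, "m1")
symmetric.crash(0, 1)
```

In this sketch, `asymmetric.recover_messages_for(1)` finds nothing because the only copy of the log was lost with the sender, whereas `symmetric.recover_messages_for(1)` can still return both messages from the surviving process.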
Subject
Physics and Astronomy (miscellaneous), General Mathematics, Chemistry (miscellaneous), Computer Science (miscellaneous)