Intelligent Fault-Tolerant Mechanism for Data Centers of Cloud Infrastructure

Published: 2022-02-08 Volume: 2022 Page: 1-12
ISSN: 1563-5147
Container-title: Mathematical Problems in Engineering
Language: en
Short-container-title: Mathematical Problems in Engineering

Author:

Kumar T Satish¹, H S Madhusudhan², Mustapha S. M. F. D. Syed³, Gupta Punit⁴, Tripathi Rajan Prasad⁵

Affiliation:

1. Department of Computer Science & Engineering, BMS Institute of Technology & Management, Bengaluru, Karnataka, India

2. Department of Computer Science & Engineering, NIE Institute of Technology, Mysuru, Karnataka, India

3. College of Technological Innovation, Zayed University, Dubai, UAE

4. Department of Computer and Communication Engineering, Manipal University Jaipur, Jaipur, India

5. Department of Electronics and Communication, Amity University Tashkent, Tashkent, Uzbekistan

Abstract

Fault tolerance in cloud computing is considered one of the most vital issues in delivering reliable services. Checkpoint/restart is one of the methods used to enhance the reliability of cloud services. However, many existing methods do not address virtual machine (VM) failures caused by the high response time of a node, Byzantine faults, or performance faults, and they also ignore optimization during the recovery phase. This paper proposes a checkpoint/restart mechanism to enhance the reliability of cloud services. Our work is threefold: (1) we design an algorithm to identify virtual machine failure due to several faults; (2) we design an algorithm to optimize the checkpoint interval time; (3) lastly, we use asynchronous checkpoint/restart with a log-based recovery mechanism to restart the failed tasks. The evaluation results obtained using a real-time dataset show that the proposed model reduces power consumption and improves performance, providing a better fault tolerance solution than the nonoptimized method.
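
The abstract does not state how the checkpoint interval is optimized. As a rough illustration of what such an optimization trades off, the sketch below uses Young's classical first-order approximation, which is a well-known baseline and not the algorithm proposed in this paper. The function name, parameter names, and sample values are assumptions made for illustration only.

```python
import math


def young_checkpoint_interval(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Return an approximate checkpoint interval in seconds.

    Uses Young's first-order approximation, sqrt(2 * C * MTBF), where C is
    the time to write one checkpoint and MTBF is the mean time between
    failures. This is a textbook baseline, not the paper's method.
    """
    if checkpoint_cost_s <= 0 or mtbf_s <= 0:
        raise ValueError("checkpoint cost and MTBF must be positive")
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)


# Hypothetical values: a 50 s checkpoint on a VM with a 10,000 s MTBF.
interval_s = young_checkpoint_interval(checkpoint_cost_s=50.0, mtbf_s=10_000.0)
print(interval_s)  # 1000.0
```

Checkpointing more often than this wastes time writing checkpoints, while checkpointing less often increases the work lost on each failure; the approximation balances the two under the assumption that the checkpoint cost is small relative to the MTBF.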

Publisher

Hindawi Limited

Subject

General Engineering, General Mathematics

Link

http://downloads.hindawi.com/journals/mpe/2022/2379643.pdf

References: 20 articles.

1. Toward a smart cloud: a review of fault-tolerance methods in cloud systems;M. A Mukwevho;IEEE Transactions on Services Computing,2018

2. A survey on reliability in distributed systems

3. Using cloud-based resources to improve availability and reliability in a scientific workflow execution framework;S. Hernández

4. A survey of fault tolerance architecture in cloud computing

5. Fault tolerance-challenges, techniques and implementation in cloud computing;A. Bala;International Journal of Computer Science Issues (IJCSI),2012

Cited by 7 articles.

1. Fault‐tolerance approaches for distributed and cloud computing environments: A systematic review, taxonomy and future directions;Concurrency and Computation: Practice and Experience;2024-03-18

2. Dynamic Scalability Mechanisms for Microservices in Federated Cloud Platform;2023 IEEE 5th International Conference on Civil Aviation Safety and Information Technology (ICCASIT);2023-10-11

3. Intelligent Identification over Power Big Data: Opportunities, Solutions, and Challenges;Computer Modeling in Engineering & Sciences;2023

4. Gorilla Troops Optimizer Based Fault Tolerant Aware Scheduling Scheme for Cloud Environment;Intelligent Automation & Soft Computing;2023

5. High Availability Design of Avionics System Architecture Based on K3s;Lecture Notes in Electrical Engineering;2023