Affiliation:
1. Computer and Information Science Department, University of Massachusetts Dartmouth, North Dartmouth, MA 02747, USA
Abstract
Correctly measuring the reliability and availability of a cloud-based system is critical for evaluating its system performance. Due to the promised high reliability of physical facilities provided for cloud services, software faults have become one of the major factors for the failures of cloud-based systems. In this paper, we focus on the software aging phenomenon where system performance may be progressively degraded due to exhaustion of system resources, fragmentation and accumulation of errors. We use a proactive technique, called software rejuvenation, to counteract the software aging problem. The dynamic fault tree (DFT) formalism is adopted to model the system reliability before and during a software rejuvenation process in an aging cloud-based system. A novel analytical approach is presented to derive the reliability function of a cloud-based Hot SPare (HSP) gate, which is further verified using Continuous Time Markov Chains (CTMC) for its correctness. We use a case study of a cloud-based system to illustrate the validity of our approach. Based on the reliability analytical results, we show how cost-effective software rejuvenation schedules can be created to keep the system reliability consistently staying above a predefined critical level.
Publisher
World Scientific Pub Co Pte Lt
Subject
Artificial Intelligence,Computer Graphics and Computer-Aided Design,Computer Networks and Communications,Software
Cited by
11 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献