Multitier Web System Reliability: Identifying Causative Metrics and Analyzing Performance Anomaly Using a Regression Model
Author:
Kim Sundeuk12ORCID, Kim Jong Seon3, In Hoh Peter1ORCID
Affiliation:
1. Department of Computer Science, Korea University, Seoul 02841, Republic of Korea 2. Platform Planning Group, Samsung SDS, Seoul 05510, Republic of Korea 3. MSP Development Group, Samsung SDS, Seoul 05510, Republic of Korea
Abstract
With the development of the Internet and communication technologies, the types of services provided by multitier Web systems are becoming more diverse and complex compared to those of the past. Ensuring a continuous availability of business services is crucial for multitier Web system providers, as service performance issues immediately affect customer experience and satisfaction. Large companies attempt to monitor the system performance indicator (SPI) that summarizes the status of multitier Web systems to detect performance anomalies at an early stage. However, the current anomaly detection methods are designed to monitor a single specific SPI. Moreover, the existing approaches consider performance anomaly detection and its root cause analysis separately, thereby aggravating the burden of resolving the performance anomaly. To support the system provider in diagnosing the performance anomaly, we propose an advanced causative metric analysis (ACMA) framework. First, we draw out 191 performance metrics (PMs) closely related to the target SPI. Among these PMs, the ACMA determines 62 vital PMs that have the most influence on the variance of the target SPI using several statistical methods. Then, we implement a performance anomaly detection model to identify the causative metrics (CMs) between the vital PMs using random forest regression. Even if the target SPI changes, our detection model does not require any change in its model structure and can derive closely related PMs of the target SPI. Based on our experiments, wherein we applied the ACMA to the business services in an enterprise system, we observed that the proposed ACMA could correctly detect various performance anomalies and their CMs.
Funder
Blockchain Research Institute(BRI) of Korea University
Subject
Electrical and Electronic Engineering,Biochemistry,Instrumentation,Atomic and Molecular Physics, and Optics,Analytical Chemistry
Reference49 articles.
1. Performance anomaly detection and bottleneck identification;Ibidunmoye;ACM Comput. Surv.,2015 2. Ren, H., Xu, B., Wang, Y., Yi, C., Huang, C., Kou, X., Xing, T., Yang, M., Tong, J., and Zhang, Q. (2019, January 4–8). Time-series anomaly detection service at microsoft. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA. 3. Laptev, N., Amizadeh, S., and Flint, I. (2015, January 10–13). Generic and scalable framework for automated time-series anomaly detection. Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, NSW, Australia. 4. Peiris, M., Hill, J.H., Thelin, J., Bykov, S., Kliot, G., and Konig, C. (July, January 27). Pad: Performance anomaly detection in multi-server distributed systems. Proceedings of the 2014 IEEE 7th International Conference on Cloud Computing, Anchorage, AK, USA. 5. (2022, June 09). Jennifersoft Inc. 2021. Available online: https://jennifersoft.com/en/.
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
|
|