Hi-RCA: A Hierarchy Anomaly Diagnosis Framework Based on Causality and Correlation Analysis
Published:2023-11-08
Issue:22
Volume:13
Page:12126
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences
Author:
Yang Jingjing1, Guo Yuchun1, Chen Yishuai1, Zhao Yongxiang1
Affiliation:
1. School of Electronic Information and Engineering, Beijing Jiaotong University, Beijing 100044, China
Abstract
Microservice architecture has been widely adopted by large-scale applications. However, the sheer volume of data and the complex dependencies among microservices pose new challenges for ensuring reliable performance and maintenance. Existing approaches still suffer from limited anomaly data, over-simplified metric relationships, and a lack of diagnostic interpretability. To address these issues, this paper builds a hierarchical root cause diagnosis framework named Hi-RCA. We propose a global perspective for characterizing different abnormal symptoms, which focuses on changes in the causation and correlation among metrics. We decompose the diagnosis task into two phases: anomalous microservice localization and anomaly reason diagnosis. In the first phase, we use Kalman filtering to quantify microservice abnormality based on the estimation error. In the second phase, we use causation analysis to identify anomalous metrics and generate anomaly knowledge graphs; through correlation analysis, we construct an anomaly propagation graph and explain the anomaly symptoms via graph comparison. Our experimental evaluation on an open dataset shows that Hi-RCA can effectively locate root causes with 90% mean average precision, outperforming state-of-the-art methods.
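The first phase described in the abstract, scoring a microservice by its Kalman filter estimation error, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a scalar random-walk state model, hypothetical noise variances (`process_var`, `measurement_var`), invented service names, and a score defined as the largest absolute one-step prediction error. The paper's actual model and scoring rule may differ.

```python
from typing import Dict, List, Tuple


def kalman_residuals(series: List[float],
                     process_var: float = 1e-3,
                     measurement_var: float = 1e-1) -> List[float]:
    """Absolute one-step prediction error of a scalar Kalman filter.

    Assumption: the metric follows a random walk, so the predicted value
    equals the previous estimate. Both variances are illustrative only.
    """
    if not series:
        return []
    estimate = series[0]   # initial state estimate
    error_var = 1.0        # initial estimate uncertainty
    residuals = []
    for observed in series:
        # Predict step: state is unchanged, uncertainty grows.
        predicted = estimate
        predicted_var = error_var + process_var
        # Estimation error (innovation) before the observation is absorbed.
        innovation = observed - predicted
        residuals.append(abs(innovation))
        # Update step: blend prediction and observation via the Kalman gain.
        gain = predicted_var / (predicted_var + measurement_var)
        estimate = predicted + gain * innovation
        error_var = (1.0 - gain) * predicted_var
    return residuals


def rank_services(metrics: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Rank services by their largest estimation error, most abnormal first."""
    scores = {name: max(kalman_residuals(values), default=0.0)
              for name, values in metrics.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical latency series (ms) for two made-up services.
metrics = {
    "cart": [10.0] * 20,
    "payment": [10.0] * 15 + [50.0] * 5,
}
ranked = rank_services(metrics)
print(ranked[0][0])  # service with the largest estimation error
```

In this sketch a service whose metric stays flat keeps an estimation error of zero, while a sudden level shift produces a large error at the moment the filter's prediction is violated, which is what pushes that service to the top of the ranking.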
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science