Affiliation
1. Sun Yat-sen University, Guangzhou, China
2. Tencent, Guangzhou, China
Abstract
In large-scale online service systems, software changes are inevitable and frequent. Despite rigorous pre-deployment testing, defective software changes cannot be completely kept out of the online environment. Consequently, there is a pressing need for automated techniques that can effectively identify these defective changes. However, current abnormal change detection (ACD) approaches fall short in accurately pinpointing defective changes, primarily because they disregard the propagation of faults. To address the limitations of ACD, we propose a novel concept called root cause change analysis (RCCA) to identify the underlying root causes of change-inducing incidents. To apply the RCCA concept to practical scenarios, we have devised an intelligent RCCA framework named ChangeRCA, which aims to localize the defective change associated with a change-inducing incident among multiple changes. To assess the effectiveness of ChangeRCA, we have conducted an extensive evaluation using a real-world dataset from WeChat and a simulated dataset encompassing 81 diverse defective changes. The evaluation results demonstrate that ChangeRCA outperforms state-of-the-art ACD approaches, achieving a Top-1 Hit Rate of 85% and significantly reducing the time required to identify defective changes.
Funder
National Natural Science Foundation of China
Guangdong Basic and Applied Basic Research Foundation
Publisher
Association for Computing Machinery (ACM)