Interpretable machine learning for predicting chronic kidney disease progression risk

Author:

Zheng Jin-Xin (1), Li Xin (1), Zhu Jiang (2), Guan Shi-Yang (3), Zhang Shun-Xian (4, 5), Wang Wei-Ming (1)

Affiliation:

1. Department of Nephrology, Ruijin Hospital, Institute of Nephrology, Shanghai Jiao Tong University School of Medicine, Shanghai, China

2. Liver Transplantation Center, West China Hospital, Sichuan University, Chengdu, China

3. Department of Statistics, Second Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China

4. School of Global Health, Chinese Center for Tropical Diseases Research – Shanghai Jiao Tong University School of Medicine, Shanghai, China

5. Clinical Research Center, Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China

Abstract

Objective: Chronic kidney disease (CKD) poses a major global health burden. Early CKD risk prediction enables timely intervention, but conventional models have limited accuracy. Machine learning (ML) improves prediction, but interpretability is needed to support clinical use in both diagnosis and decision-making.

Methods: A cohort of 491 patients with clinical data was collected for this study. The dataset was randomly split into a training set (80%) and a testing set (20%). For the classification task, we developed four ML algorithms (logistic regression, random forests, neural networks, and eXtreme Gradient Boosting (XGBoost)) to classify patients into two classes: those who progressed to CKD stages 3–5 during follow-up (positive class) and those who did not (negative class). The area under the receiver operating characteristic curve (AUC-ROC) was used to evaluate how well each model discriminated between the two classes. For survival analysis, Cox proportional hazards regression (COX) and random survival forests (RSF) were employed to predict CKD progression, and the concordance index (C-index) and integrated Brier score were used for model evaluation. Variable importance, partial dependence plots, and restricted cubic splines were used to interpret the models’ results.

Results: XGBoost demonstrated the best predictive performance for CKD progression in the classification task, with an AUC-ROC of 0.867 (95% confidence interval (CI): 0.728–1.000), outperforming the other ML algorithms. In survival analysis, RSF showed slightly better discrimination and calibration on the test set than COX, suggesting better generalization to new data. Variable importance analysis identified estimated glomerular filtration rate, age, and creatinine as the most important predictors in the CKD survival analysis. Further analysis revealed non-linear associations between age and CKD progression, suggesting higher risks in patients aged 52–55 and 65–66 years. The association between cholesterol levels and CKD progression was also non-linear, with lower risks observed when cholesterol levels were in the range of 5.8–6.4 mmol/L.

Conclusions: Our study demonstrated the effectiveness of interpretable ML models for predicting CKD progression. The comparison between COX and RSF highlighted the advantages of ML in survival analysis, particularly in handling non-linearity and high-dimensional data. By using interpretable ML to unravel relationships among risk factors, contrast predictive techniques, and expose non-linear associations, this study advances CKD risk prediction in support of clinical decision-making.
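
The two discrimination metrics named in the Methods, AUC-ROC for the classification task and the C-index for survival analysis, can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `auc_roc` and `harrell_c_index` are hypothetical, the toy data are made up and unrelated to the study cohort, and the C-index shown is Harrell's pairwise formulation with tied follow-up times skipped for simplicity.

```python
from itertools import combinations


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random positive case receives a
    higher score than a random negative case (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def harrell_c_index(times, events, risks):
    """Harrell's C-index for right-censored follow-up data.

    A pair is comparable when the subject with the shorter follow-up time
    had the event. It is concordant when that subject also has the higher
    predicted risk; tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification (assumption): skip tied times
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject was censored, so the order is unknown
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0
        elif risks[first] == risks[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy, made-up data purely for illustration (not from the study).
labels = [1, 1, 0, 0, 1, 0]                # 1 = progressed to CKD stages 3-5
scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]    # predicted probabilities
toy_auc = auc_roc(labels, scores)

times = [5, 8, 12, 3, 10]                  # follow-up time (arbitrary units)
events = [1, 0, 1, 1, 0]                   # 1 = progression observed, 0 = censored
risks = [0.5, 0.3, 0.6, 0.9, 0.4]          # predicted risk scores
toy_c = harrell_c_index(times, events, risks)
```

A value of 0.5 for either metric corresponds to chance-level ranking and 1.0 to perfect ranking. In practice an established library implementation would normally be preferred over a quadratic-time loop like this one.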

Funder

National Key Research and Development Program of China

Natural Science Foundation of Shanghai Municipality

Three-year Action Plan for Promoting Clinical Skills and Innovation Ability of Municipal Hospitals

China National Science Foundation

Publisher

SAGE Publications

Subject

Health Information Management, Computer Science Applications, Health Informatics, Health Policy

