Interpretable Machine Learning Model for Predicting and Risk Assessment of Diabetic Nephropathy (Preprint)

Authors:

Wen Yili, Wan Zhiqiang, Ren Huiling, Wang Xu, Wang Weijie

Abstract

Introduction: Diabetic nephropathy (DN), a severe complication of diabetes, is characterized by proteinuria, hypertension, and progressive decline in renal function, and can lead to end-stage renal disease (ESRD). Its pathogenesis involves high glucose levels, oxidative stress, inflammation, and fibrosis, which produce kidney changes such as glomerular basement membrane thickening and glomerulosclerosis. The International Diabetes Federation projects that 783 million people will have diabetes by 2045, and that 30% to 40% of them will develop DN. Early detection and intervention are crucial for preserving renal function, improving quality of life, reducing cardiovascular complications, and lowering healthcare costs.

Methods: This study used machine learning (ML) techniques to develop and validate a predictive model for DN, aiming for both high predictive accuracy and model interpretability. Data from 1,000 patients with type 2 diabetes, 444 with DN and 556 without, were analyzed. Several ML algorithms were employed: decision trees, random forests, Extra Trees, AdaBoost, XGBoost, and LightGBM. Missing data were handled with multiple imputation, and class imbalance was addressed with the Synthetic Minority Over-sampling Technique (SMOTE). Model performance was evaluated using accuracy, precision, recall, F1 score, specificity, and area under the curve (AUC). Explainable machine learning (XML) techniques, namely LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations), were used to make the models more transparent and interpretable.

Results: XGBoost and LightGBM outperformed the other algorithms. XGBoost achieved the highest accuracy, 86.87%, with a precision of 88.90%, a recall of 84.40%, an F1 score of 86.44%, and a specificity of 89.12%. LIME and SHAP analyses showed how individual features contributed to the predictions and identified serum creatinine, C-peptide, albumin, and lipoproteins as significant predictors.

Conclusion: The developed ML model provides a robust predictive tool for early diagnosis and risk assessment of DN, and it does so with the transparency and interpretability that clinical integration requires. By enabling early intervention and personalized treatment strategies, the model has the potential to improve patient outcomes and optimize the use of healthcare resources.
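For readers less familiar with the evaluation metrics listed in the Methods, the following minimal sketch shows how accuracy, precision, recall, F1 score, and specificity relate to the four cells of a binary confusion matrix, with DN treated as the positive class. It is an illustration only, not the study's code: the names `ConfusionCounts` and `classification_metrics` and the example counts are hypothetical, and AUC is left out because it depends on predicted scores rather than on a single confusion matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with DN as the positive class."""
    tp: int  # DN patients predicted as DN
    fp: int  # non-DN patients predicted as DN
    tn: int  # non-DN patients predicted as non-DN
    fn: int  # DN patients predicted as non-DN


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute threshold-based metrics; assumes every denominator is non-zero."""
    total = c.tp + c.fp + c.tn + c.fn
    precision = c.tp / (c.tp + c.fp)      # share of predicted DN that is truly DN
    recall = c.tp / (c.tp + c.fn)         # share of true DN that is detected
    specificity = c.tn / (c.tn + c.fp)    # share of true non-DN that is cleared
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "accuracy": (c.tp + c.tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


# Hypothetical counts for a 200-patient held-out set; NOT taken from the study.
example = ConfusionCounts(tp=80, fp=10, tn=100, fn=10)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Reporting specificity alongside recall matters in a screening setting such as this one, because the two capture different errors: missed DN cases lower recall, while false alarms among patients without DN lower specificity.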

Publisher

JMIR Publications Inc.
