Author:
Haratian Arezoo, Maleki Zeinab, Shayegh Farzaneh, Safaeian Alireza
Abstract
Due to the increasing prevalence of chronic kidney disease and its high mortality rate, studying the risk factors that affect the progression of the disease is of great importance. In this work, we aim to develop a framework for using machine learning methods to identify factors affecting kidney function. To this end, classification methods are trained to predict the serum creatinine level, assigned to one of three classes representing different ranges of the variable, from the numerical values of other blood test parameters. Models are trained on blood test results from healthy subjects and patients, covering 46 different blood test parameters. The best-performing models are random forest and LightGBM. Interpretation of the resulting model reveals a direct relationship between vitamin D and blood creatinine level. The detected association between these two parameters is considered reliable given the relatively high predictive performance of the random forest model, which reaches an AUC of 0.90 and an accuracy of 0.74. Moreover, we develop a Bayesian network to infer the direct relationships between blood test parameters, and its results are consistent with those of the classification models. The proposed framework uses an inclusive set of advanced imputation methods to deal with missing values, the main challenge of working with electronic health data. Hence, it can be applied to similar clinical studies to investigate and discover relationships between the factors under study.
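The two data-preparation steps named in the abstract, filling in missing blood test values and mapping serum creatinine to one of three classes, might be sketched as follows. This is only an illustrative sketch under stated assumptions: plain median imputation stands in for the advanced imputation methods the abstract mentions, the function names `impute_median` and `to_three_classes` are hypothetical, and the cut-offs and sample values are made up rather than taken from the paper.

```python
import statistics


def impute_median(column):
    """Replace None entries with the median of the observed values.

    Assumption: simple median imputation, used here as a stand-in for
    the more advanced imputation methods referred to in the abstract.
    """
    observed = [v for v in column if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in column]


def to_three_classes(values, low_cut, high_cut):
    """Map each value to class 0 (< low_cut), 1 (in between), or 2 (>= high_cut)."""
    labels = []
    for v in values:
        if v < low_cut:
            labels.append(0)
        elif v < high_cut:
            labels.append(1)
        else:
            labels.append(2)
    return labels


# Made-up values for illustration only; not data from the study.
creatinine = [0.7, 0.9, 1.1, 1.4, 2.3, 0.8]
vitamin_d = [32.0, None, 25.0, 41.0, None, 18.0]

# Hypothetical cut-offs; the abstract does not state the class boundaries.
vitamin_d_filled = impute_median(vitamin_d)
labels = to_three_classes(creatinine, low_cut=0.9, high_cut=1.3)
```

With the features imputed and the target discretized in this way, the resulting table could then be passed to a classifier such as a random forest and evaluated by AUC and accuracy, as the abstract describes.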
Publisher
Springer Science and Business Media LLC
Cited by
1 article.