Using a multi-staged strategy based on machine learning and mathematical modeling to predict genotype-phenotype risk patterns in diabetic kidney disease: a prospective case–control cohort analysis

Published:2013-07-23 Issue:1 Volume:14
ISSN:1471-2369
Container-title:BMC Nephrology
language:en
Short-container-title:BMC Nephrol

Author:

Leung Ross KK, Wang Ying, Ma Ronald CW, Luk Andrea OY, Lam Vincent, Ng Maggie, So Wing Yee, Tsui Stephen KW, Chan Juliana CN

Abstract

Background: Multi-causality and heterogeneity of phenotypes and genotypes characterize complex diseases. In a database with a comprehensive collection of phenotypes and genotypes, we compared the performance of common machine learning methods to generate mathematical models to predict diabetic kidney disease (DKD).

Methods: In a prospective cohort of type 2 diabetic patients, we selected 119 subjects with DKD and 554 without DKD at enrolment and after a median follow-up period of 7.8 years for model training, testing and validation using seven machine learning methods (partial least squares regression, the classification and regression tree, the C5.0 decision tree, random forest, naïve Bayes classification, neural network and support vector machine). We used 17 clinical attributes and 70 single nucleotide polymorphisms (SNPs) of 54 candidate genes to build different models. The top attributes selected by the best-performing models were then used to build models with performance comparable to those using the entire dataset.

Results: Age, age of diagnosis, systolic blood pressure and genetic polymorphisms of uteroglobin and lipid metabolism were selected by most methods. Models generated by support vector machine (svmRadial) and random forest (cforest) had the best prediction accuracy, whereas models derived from the naïve Bayes classifier and partial least squares regression performed worst. Using 10 clinical attributes (systolic and diastolic blood pressure, age, age of diagnosis, triglyceride, white blood cell count, total cholesterol, waist to hip ratio, LDL cholesterol, and alcohol intake) and 5 genetic attributes (UGB G38A, LIPC -514C > T, APOB Thr71Ile, APOC3 3206T > G and APOC3 1100C > T), selected most often by SVM and cforest, we were able to build high-performance models.

Conclusions: Amongst different machine learning methods, svmRadial and cforest had the best performance. Genetic polymorphisms related to inflammation and lipid metabolism warrant further investigation for their associations with DKD.
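Two ideas in the abstract can be illustrated with a short Python sketch. It is not the authors' code: the method names svmRadial and cforest match identifiers used by the R caret package, but the abstract does not describe the implementation. First, with 119 DKD and 554 non-DKD subjects, the cohort is imbalanced, so any reported accuracy is best read against the accuracy of always predicting the larger class. Second, "selected most often" can be read as a frequency tally across models. The function names and the per-model selections below are hypothetical and serve only to show the mechanics.

```python
from collections import Counter
from typing import Dict, List

# Cohort sizes as stated in the abstract.
N_DKD = 119
N_NO_DKD = 554


def majority_baseline_accuracy(n_pos: int, n_neg: int) -> float:
    """Accuracy of a trivial classifier that always predicts the larger class."""
    return max(n_pos, n_neg) / (n_pos + n_neg)


def rank_by_selection_frequency(selections: Dict[str, List[str]], top_k: int) -> List[str]:
    """Return the top_k attributes chosen by the most models.

    Each model counts at most once per attribute; ties are broken by
    plain string ordering of the attribute name.
    """
    counts = Counter(attr for attrs in selections.values() for attr in set(attrs))
    ranked = sorted(counts, key=lambda a: (-counts[a], a))
    return ranked[:top_k]


# HYPOTHETICAL selections, invented for illustration only.
# They are not results reported in the paper.
example_selections = {
    "svmRadial": ["age", "systolic_bp", "UGB_G38A"],
    "cforest": ["age", "systolic_bp", "triglyceride"],
    "C5.0": ["age", "UGB_G38A", "alcohol_intake"],
}

baseline = majority_baseline_accuracy(N_DKD, N_NO_DKD)
top_attributes = rank_by_selection_frequency(example_selections, top_k=3)

print(f"majority-class baseline accuracy: {baseline:.3f}")
print("most frequently selected (hypothetical data):", top_attributes)
```

A model whose accuracy is close to this baseline has learned little beyond the class proportions, which is why imbalance-aware measures such as sensitivity and specificity are usually reported alongside accuracy.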

Publisher

Springer Science and Business Media LLC

Subject

Nephrology

Link

http://link.springer.com/content/pdf/10.1186/1471-2369-14-162.pdf

Reference: 28 articles.

1. Luk AO, So WY, Ma RC, Kong AP, Ozaki R, Ng VS, Yu LW, Lau WW, Yang X, Chow FC, Chan JC, Tong PC: Metabolic syndrome predicts new onset of chronic kidney disease in 5,829 patients with type 2 diabetes: a 5-year prospective analysis of the Hong Kong Diabetes Registry. Diabetes Care. 2008, 31: 2357-2361. 10.2337/dc08-0971.

2. Freedman BI, Bostrom M, Daeihagh P, Bowden DW: Genetic factors in diabetic nephropathy. Clin J Am Soc Nephrol. 2007, 2: 1306-1316. 10.2215/CJN.02560607.

3. Liu Y, Freedman BI: Genetics of progressive renal failure in diabetic kidney disease. Kidney Int Suppl. 2005, 99: S94-S97.

4. Schork NJ, Murray SS, Frazer KA, Topol EJ: Common vs. rare allele hypotheses for complex diseases. Curr Opin Genet Dev. 2009, 19: 212-219. 10.1016/j.gde.2009.04.010.

5. Yang Q, Khoury MJ, Friedman JM, Little J, Flanders WD: How many genes underlie the occurrence of common complex diseases in the population? Int J Epidemiol. 2005, 34: 1129-1137. 10.1093/ije/dyi130.

Cited by 52 articles.

1. Analyzing students' academic performance using educational data mining;Computers and Education: Artificial Intelligence;2024-12

2. A non-linear association between low-density lipoprotein cholesterol and the risk of diabetic kidney disease in patients with type 2 diabetes in China;Preventive Medicine Reports;2024-09

3. From bytes to nephrons: AI’s journey in diabetic kidney disease;Journal of Nephrology;2024-08-12

4. Machine and deep learning techniques for the prediction of diabetics: a review;Multimedia Tools and Applications;2024-07-16

5. Machine Learning-Based Predictive Modeling of Diabetic Nephropathy in Type 2 Diabetes Using Integrated Biomarkers: A Single-Center Retrospective Study;Diabetes, Metabolic Syndrome and Obesity;2024-05