Abstract
Several circulating biomarkers are reported to be associated with diabetic retinopathy (DR). However, their contributions to DR relative to known risk factors, such as hyperglycaemia, hypertension, and hyperlipidaemia, remain unclear. In this data-driven study, we used novel models to evaluate the associations of over 400 laboratory parameters with DR compared to the established risk factors. We performed an environment-wide association study (EWAS) of the laboratory parameters available in the National Health and Nutrition Examination Survey (NHANES) 2007–2008 in individuals with diabetes, with DR as the outcome (test set). We employed independent variable (feature) selection approaches, including parallelised univariate regression modelling, Principal Component Analysis (PCA), penalised regression, and RandomForest™. These models were replicated in NHANES 2005–2006 (replication set). The test and replication sets consisted of 1025 and 637 individuals, respectively, with available DR status and laboratory data. Glycohemoglobin (HbA1c) was the strongest risk factor for DR. Our PCA-based approach produced a model incorporating 18 principal components (PCs) with an area under the curve (AUC) of 0.796 (95% CI 0.761–0.832), while penalised regression identified a 9-feature model with 78.51% accuracy and an AUC of 0.74 (95% CI 0.72–0.77). RandomForest™ identified a 31-feature model with 78.4% accuracy and an AUC of 0.71 (95% CI 0.65–0.77). When the variables selected by RandomForest™ were grouped, hyperglycaemia alone achieved an AUC of 0.72 (95% CI 0.68–0.76). The AUC increased to 0.84 (95% CI 0.78–0.90) when the model also included hypertension, hypercholesterolaemia, haematocrit, and renal and liver function tests.
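The models above are compared by AUC, which can be read as the probability that a randomly chosen individual with DR receives a higher model score than a randomly chosen individual without DR. The following is a minimal sketch of that definition in plain Python, not the study's own code: the function name `auc` and the toy labels and scores are invented for illustration and are not NHANES data.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    labels: 1 for a case (e.g. DR present), 0 for a control.
    scores: model output, higher meaning "more likely a case".
    Ties between a case and a control count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the study).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]
toy_auc = auc(labels, scores)
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.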
Cited by
12 articles.