Abstract
Objectives: Machine learning (ML) algorithms that draw on population and clinical information have the potential to improve the identification of individuals at risk of developing kidney stones.
Methods: This cross-sectional study used data from the Fasa Adults Cohort Study (FACS) to analyze the factors associated with symptomatic and clinically significant kidney stone disease. After data cleaning, 10,128 participants with 103 variables were included: one outcome variable (presence of symptomatic kidney stones) and 102 predictor variables derived from questionnaires and laboratory tests. Five ML algorithms were trained and compared: support vector machine (SVM), random forest (RF), k-nearest neighbors (KNN), gradient boosting machine (GBM), and extreme gradient boosting (XGB). Class imbalance was addressed with the synthetic minority oversampling technique (SMOTE), and each algorithm’s accuracy, precision, sensitivity, specificity, F1 score, and area under the curve (AUC) were assessed.
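The threshold-based metrics named above can all be derived from the four confusion-matrix counts. The following is a minimal sketch of that calculation, not the study's own code; the function name `binary_metrics`, the `BinaryMetrics` container, and the toy labels are hypothetical, and labels are assumed to be coded 1 (kidney stone) and 0 (no kidney stone).

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    sensitivity: float
    specificity: float
    f1: float


def _ratio(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when a metric is undefined.
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred) -> BinaryMetrics:
    """Compute threshold-based metrics from 0/1 labels (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)  # also called recall
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, len(pairs)),
        precision=precision,
        sensitivity=sensitivity,
        specificity=_ratio(tn, tn + fp),
        f1=_ratio(2 * precision * sensitivity, precision + sensitivity),
    )


# Toy, made-up labels for illustration only (not FACS data).
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 0, 0, 1, 0, 0, 0, 0]
toy_metrics = binary_metrics(toy_true, toy_pred)
```

In this toy example the accuracy is higher than the sensitivity, which illustrates why accuracy is usually read alongside the other metrics when the outcome classes are imbalanced.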
Results: The XGB model demonstrated the best performance, with an AUC of 0.60, while RF, GBM, SVM, and KNN achieved AUC values of 0.58, 0.57, 0.54, and 0.52, respectively. The RF, GBM, and XGB models exhibited acceptable accuracy, with values of 0.81, 0.81, and 0.77, respectively. The top five predictors of kidney stones were serum creatinine level, salt consumption, history of hospitalization, sleep duration, and blood urea nitrogen (BUN) level.
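To help interpret the AUC values above: the AUC equals the probability that a randomly chosen case receives a higher predicted score than a randomly chosen non-case, so 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. The sketch below computes it by direct pairwise comparison; the function name `pairwise_auc` and the toy scores are hypothetical and are not taken from the study.

```python
def pairwise_auc(y_true, scores) -> float:
    """AUC as the fraction of (case, non-case) pairs ranked correctly.

    Ties between a case and a non-case count as half a correct pair.
    Hypothetical helper for illustration; O(n_pos * n_neg) time.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy, made-up scores for illustration only (not FACS data).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = pairwise_auc(toy_labels, toy_scores)
```

A model that assigns every participant the same score gets 0.5 under this definition, which is the baseline against which the reported values can be compared.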
Conclusions: ML models have significant potential for assessing an individual's risk of developing painful kidney stones and for guiding early lifestyle modifications to mitigate this risk. Continued research in this area can lead to improved predictive capabilities and personalized interventions for kidney stone disease management.