Abstract
Background
Short stature is a prevalent pediatric endocrine disorder for which early detection and prediction are pivotal to improving treatment outcomes. However, existing diagnostic criteria often lack the necessary sensitivity and specificity because of the disorder's complex etiology. This study therefore uses machine learning (ML) techniques to develop an interpretable predictive model for short stature and to explore how growth environments influence its development.
Methods
We conducted a case-control study of 100 children with short stature, age-matched to 200 normal controls, recruited from the Endocrinology Department of Nanjing Children's Hospital between April and September 2021. Information on the children was collected through parental surveys. We assessed 33 readily accessible medical characteristics and used conditional logistic regression to explore how growth environments influence the onset of short stature. We then compared the performance of nine ML algorithms to identify the optimal model. Finally, the SHapley Additive exPlanations (SHAP) method was used to rank feature importance and refine the final model.
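As an illustration of the model-comparison step, the sketch below ranks candidate models by the area under the ROC curve (AUC). The abstract does not name the metric used, so AUC is an assumption here, and the labels, scores, and model names are hypothetical toy values rather than study data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen case receives a
    higher score than a randomly chosen control (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = short stature, 0 = control.
labels = [1, 1, 0, 0, 0, 1]
# Hypothetical predicted risks from two placeholder models.
model_scores = {
    "model_a": [0.9, 0.8, 0.3, 0.4, 0.2, 0.7],
    "model_b": [0.6, 0.4, 0.5, 0.3, 0.2, 0.7],
}

# Score every candidate and keep the one with the highest AUC.
aucs = {name: roc_auc(labels, s) for name, s in model_scores.items()}
best = max(aucs, key=aucs.get)
print(aucs, best)
```

In practice the scores would come from held-out or cross-validated predictions, so that the comparison reflects generalization rather than fit to the training data.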
Results
In multivariate logistic regression analysis, children's weight (OR = 0.85, 95% CI: 0.76, 0.96), maternal height (OR = 0.77, 95% CI: 0.68, 0.86), paternal height (OR = 0.80, 95% CI: 0.71, 0.91), maternal early puberty (OR = 0.02, 95% CI: 0.00, 0.39), and children's outdoor activity time exceeding 3 hours per day (OR = 0.01, 95% CI: 0.00, 0.68) were identified as protective factors against short stature. Parental height, children's weight, and caregiver education significantly influenced the prediction of short stature risk, and the Random Forest (RF) model demonstrated the best discriminatory ability among the nine ML models.
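To make the odds ratios easier to read, the sketch below shows how a reported OR and its 95% CI relate to the underlying log-odds coefficient, and how a per-unit OR scales over several units. It assumes a Wald-type interval that is symmetric on the log scale and a per-unit coding of the predictor; neither is stated in the abstract, and the function names are illustrative.

```python
import math


def log_odds_summary(or_point, ci_low, ci_high, z=1.96):
    """Recover the coefficient and its approximate standard error from
    an OR and a 95% CI, assuming a Wald interval on the log scale."""
    beta = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def scaled_or(or_point, units):
    """OR for a difference of `units` in the predictor, assuming the
    reported OR is per one unit and the effect is log-linear."""
    return or_point ** units


# Maternal height as reported above: OR = 0.77 (95% CI: 0.68, 0.86).
beta, se = log_odds_summary(0.77, 0.68, 0.86)
# OR for a 5-unit difference in maternal height (unit assumed, not stated).
or_5 = scaled_or(0.77, 5)
print(round(beta, 3), round(se, 3), round(or_5, 3))
```

Under these assumptions, an OR below 1 corresponds to a negative coefficient, and the effect compounds multiplicatively across units.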
Conclusions
This study indicates that growth-environment factors, particularly anthropometric characteristics, are closely associated with the occurrence of childhood short stature. The Random Forest model showed the best discrimination among the models evaluated, suggesting potential for clinical application. These findings provide theoretical support for personalized interventions and preventive measures for short stature.