Abstract
Background: Hip fracture is associated with immobility, morbidity, mortality, and high medical cost. Because availability of dual-energy X-ray absorptiometry (DXA) is limited, hip fracture prediction models that do not rely on bone mineral density (BMD) data are essential. We aimed to develop and validate 10-year sex-specific hip fracture prediction models using electronic health records (EHR) without BMD.

Methods: In this population-based study, the derivation cohort comprised 161,051 public healthcare service users (91,926 female; 69,125 male) in Hong Kong aged ≥60 years. The sex-stratified derivation cohort was randomly split into 80% training and 20% internal testing datasets. An external validation cohort comprised 3,046 community-dwelling participants. With 395 potential predictors (age, diagnosis, and drug prescription records from EHR), 10-year sex-specific hip fracture prediction models were developed in the training cohort using stepwise selection with logistic regression (LR) and four machine learning (ML) algorithms (gradient boosting machine, random forest, eXtreme gradient boosting, and single-layer neural networks). Model performance was evaluated in both the internal and external validation cohorts.

Findings: In females, the LR model had the highest area under the receiver operating characteristic curve (AUC; 0.815) and adequate calibration in internal validation. Reclassification metrics showed that the ML algorithms could not further improve on the performance of the LR model. The LR model attained similar performance in external validation, with a high AUC (0.841) comparable to that of the ML algorithms. In males, the LR model had a high AUC (0.818) and adequate calibration in internal validation, and it outperformed all ML models as indicated by reclassification metrics. In external validation, the LR model had a high AUC (0.898) comparable to that of the ML algorithms. Reclassification metrics demonstrated that the LR model had the best discrimination performance.

Interpretation: Even without BMD data, the 10-year hip fracture prediction models developed by conventional LR had better discrimination performance than the models developed by ML algorithms. Upon further validation in independent cohorts, the LR models could be integrated into the routine clinical workflow, aiding the identification of people at high risk for DXA scanning.

Funding: This study was funded by the Health and Medical Research Fund, Food and Health Bureau, Hong Kong SAR Government (reference: 17181381).
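As an illustration of two steps named in the Methods, the sketch below shows a sex-stratified random 80/20 split and a pairwise calculation of AUC. It is a minimal example under stated assumptions, not the study's code: the function names, the `sex` record field, and the toy inputs are hypothetical, and the abstract does not say which software or AUC estimator the authors used.

```python
import random


def split_80_20(records, seed=0):
    """Randomly split a list of records into 80% training and 20% testing."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def sex_stratified_split(records, seed=0):
    """Apply the 80/20 split separately within each sex.

    Assumes each record is a dict with a hypothetical "sex" key.
    """
    result = {}
    for sex in sorted({r["sex"] for r in records}):
        subset = [r for r in records if r["sex"] == sex]
        result[sex] = split_80_20(subset, seed)
    return result


def auc(scores, labels):
    """AUC as the probability that a randomly chosen case (label 1)
    receives a higher score than a randomly chosen non-case (label 0).
    Ties count as one half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

In this reading, an AUC of 0.815 means that in roughly 81.5% of case and non-case pairs the model assigns the higher predicted risk to the person who went on to have a hip fracture. The pairwise loop is quadratic in cohort size, so a rank-based estimator would be the practical choice for a cohort of the size described.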
Publisher
Cold Spring Harbor Laboratory