Does the SORG Machine-learning Algorithm for Extremity Metastases Generalize to a Contemporary Cohort of Patients? Temporal Validation From 2016 to 2020

Published: 2023-05-25 Issue: 12 Volume: 481 Page: 2419-2430
ISSN: 0009-921X
Container-title: Clinical Orthopaedics & Related Research
Language: en
Short-container-title: Clin Orthop Relat Res

Author:

de Groot Tom M.¹,², Ramsey Duncan³, Groot Olivier Q.¹, Fourman Mitchell¹, Karhade Aditya V.¹, Twining Peter K.¹, Berner Emily A.¹, Fenn Brian P.¹, Collins Austin Keith¹, Raskin Kevin¹, Lozano Santiago¹, Newman Eric¹, Ferrone Marco⁴, Doornberg Job N.², Schwab Joseph H.¹

Affiliation:

1. Massachusetts General Hospital, Boston, MA, USA

2. University Medical Center Groningen, Groningen, the Netherlands

3. University of Texas RGV School of Medicine, Edinburg, TX, USA

4. Brigham and Women’s Hospital, Boston, MA, USA

Abstract

Background: The ability to predict survival accurately in patients with osseous metastatic disease of the extremities is vital for patient counseling and guiding surgical intervention. We, the Skeletal Oncology Research Group (SORG), previously developed a machine-learning algorithm (MLA) based on data from 1999 to 2016 to predict 90-day and 1-year survival of surgically treated patients with extremity bone metastasis. As treatment regimens for oncology patients continue to evolve, this SORG MLA-driven probability calculator requires temporal reassessment of its accuracy.

Question/purpose: Does the SORG-MLA accurately predict 90-day and 1-year survival in patients who receive surgical treatment for a metastatic long-bone lesion in a more recent cohort of patients treated between 2016 and 2020?

Methods: Between 2017 and 2021, we identified 674 patients 18 years and older through the ICD codes for secondary malignant neoplasm of bone and bone marrow and CPT codes for completed pathologic fractures or prophylactic treatment of an impending fracture. We excluded 40% (268 of 674) of patients, including 18% (118) who did not receive surgery; 11% (72) who had metastases in places other than the long bones of the extremities; 3% (23) who received treatment other than intramedullary nailing, endoprosthetic reconstruction, or dynamic hip screw; 3% (23) who underwent revision surgery; 3% (17) in whom there was no tumor; and 2% (15) who were lost to follow-up within 1 year. Temporal validation was performed using data on 406 patients treated surgically for bony metastatic disease of the extremities from 2016 to 2020 at the same two institutions where the MLA was developed. Variables used to predict survival in the SORG algorithm included perioperative laboratory values, tumor characteristics, and general demographics.
To assess the models’ discrimination, we computed the c-statistic, commonly referred to as the area under the receiver operating characteristic curve (AUC) for binary classification. This value ranges from 0.5 (representing chance-level performance) to 1.0 (indicating excellent discrimination). Generally, an AUC of 0.75 is considered high enough for use in clinical practice. To evaluate the agreement between predicted and observed outcomes, a calibration plot was used, and the calibration slope and intercept were calculated. Perfect calibration would result in a slope of 1 and an intercept of 0. For overall performance, the Brier score and null-model Brier score were determined. The Brier score can range from 0 (representing perfect prediction) to 1 (indicating the poorest prediction). Proper interpretation of the Brier score necessitates a comparison with the null-model Brier score, which represents the score for an algorithm that predicts a probability equal to the population prevalence of the outcome for each patient. Finally, a decision curve analysis was conducted to compare the potential net benefit of the algorithm with other decision-support methods, such as treating all or none of the patients. Overall, 90-day and 1-year mortality were lower in the temporal validation cohort than in the development cohort (90 day: 23% versus 28%; p < 0.001, and 1 year: 51% versus 59%; p < 0.001).

Results: Mortality was lower in the validation cohort than in the cohort on which the model was trained: 23% versus 28% at the 90-day timepoint and 51% versus 59% at the 1-year timepoint. The AUC was 0.78 (95% CI 0.72 to 0.82) for 90-day survival and 0.75 (95% CI 0.70 to 0.79) for 1-year survival, indicating the model could distinguish the two outcomes reasonably well.
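
The discrimination and overall-performance measures described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the toy data are hypothetical, and the outcome coding (1 = died within the prediction horizon) is an assumption made for illustration.

```python
from itertools import product


def c_statistic(y_true, y_prob):
    """AUC as the share of (event, non-event) pairs in which the event
    received the higher predicted risk; tied pairs count as 0.5."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    score = 0.0
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            score += 1.0
        elif p_event == p_non_event:
            score += 0.5
    return score / (len(events) * len(non_events))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def null_brier_score(prevalence):
    """Brier score of a model that predicts the outcome prevalence for
    every patient, which simplifies to prevalence * (1 - prevalence)."""
    return prevalence * (1.0 - prevalence)


# Hypothetical toy data, not taken from the study (1 = died within the horizon).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.8, 0.3, 0.6, 0.4, 0.1, 0.35, 0.2, 0.5]

print("c-statistic:", c_statistic(y_true, y_prob))
print("Brier score:", brier_score(y_true, y_prob))
print("null-model Brier score:", null_brier_score(sum(y_true) / len(y_true)))
```

Applying `null_brier_score` to the rounded validation-cohort mortality rates quoted in this abstract (23% and 51%) gives roughly 0.18 and 0.25. These are derived here for orientation only; the abstract itself does not report the null-model Brier scores.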
For the 90-day model, the calibration slope was 0.71 (95% CI 0.53 to 0.89), and the intercept was -0.66 (95% CI -0.94 to -0.39), suggesting the predicted risks were overly extreme, and that in general, the risk of the observed outcome was overestimated. For the 1-year model, the calibration slope was 0.73 (95% CI 0.56 to 0.91) and the intercept was -0.67 (95% CI -0.90 to -0.43). With respect to overall performance, the model’s Brier scores for the 90-day and 1-year models were 0.16 and 0.22. These scores were higher than the Brier scores from internal validation in the development study (0.13 and 0.14), indicating the models’ performance has declined over time.

Conclusion: The SORG MLA to predict survival after surgical treatment of extremity metastatic disease showed decreased performance on temporal validation. Moreover, in patients undergoing innovative immunotherapy, mortality risk was overestimated to varying degrees. Clinicians should be aware of this overestimation and discount the prediction of the SORG MLA according to their own experience with this patient population. Generally, these results show that temporal reassessment of these MLA-driven probability calculators is of paramount importance because the predictive performance may decline over time as treatment regimens evolve. The SORG-MLA is available as a freely accessible internet application at https://sorg-apps.shinyapps.io/extremitymetssurvival/.

Level of Evidence: Level III, prognostic study.
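
To make the calibration and decision-curve quantities named in the abstract concrete, the following is a minimal Python sketch under stated assumptions. It assumes the calibration slope comes from a logistic recalibration of the outcome on the logit of the predicted risk, and that the intercept is estimated with the slope fixed at 1; the abstract does not say which exact procedure the authors used. All function names and the toy data are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_recalibration(y_true, y_prob, fix_slope=None, iters=50):
    """Fit outcome ~ sigmoid(a + b * logit(p)) by Newton-Raphson.
    If fix_slope is given, b is held at that value and only a is estimated."""
    x = [logit(p) for p in y_prob]
    a, b = 0.0, (1.0 if fix_slope is None else fix_slope)
    for _ in range(iters):
        mu = [sigmoid(a + b * xi) for xi in x]
        w = [m * (1.0 - m) for m in mu]
        g_a = sum(yi - mi for yi, mi in zip(y_true, mu))
        h_aa = sum(w)
        if fix_slope is not None:
            a += g_a / h_aa  # one-parameter Newton step
            continue
        g_b = sum((yi - mi) * xi for yi, mi, xi in zip(y_true, mu, x))
        h_ab = sum(wi * xi for wi, xi in zip(w, x))
        h_bb = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h_aa * h_bb - h_ab * h_ab
        # Two-parameter Newton step: solve the 2x2 system H * delta = g.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the 'treat all' strategy; 'treat none' is always 0."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data, not taken from the study (1 = died within the horizon).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.8, 0.3, 0.6, 0.4, 0.1, 0.35, 0.2, 0.5]

_, slope = fit_recalibration(y_true, y_prob)
intercept, _ = fit_recalibration(y_true, y_prob, fix_slope=1.0)
print("calibration slope:", slope)
print("calibration intercept (slope fixed at 1):", intercept)
print("net benefit at 0.5:", net_benefit(y_true, y_prob, 0.5))
print("treat-all net benefit at 0.5:", net_benefit_treat_all(y_true, 0.5))
```

Read against the Results, a slope below 1 corresponds to predictions that are too extreme, and a negative intercept corresponds to risk that is overestimated on average. A decision curve repeats the net-benefit calculation over a range of thresholds and compares the model with the treat-all and treat-none strategies.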

Publisher

Ovid Technologies (Wolters Kluwer Health)

Subject

Orthopedics and Sports Medicine, General Medicine, Surgery

Cited by 9 articles.

1. CORR Insights®: Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone;Clinical Orthopaedics & Related Research;2024-08-23

2. Artificial Intelligence in Detection, Management, and Prognosis of Bone Metastasis: A Systematic Review;Cancers;2024-07-29

3. Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone;Clinical Orthopaedics & Related Research;2024-07-23

4. Machine Learning–Assisted Decision Making in Orthopaedic Oncology;JBJS Reviews;2024-07

5. Erratum to: Does the SORG Machine-learning Algorithm for Extremity Metastases Generalize to a Contemporary Cohort of Patients? Temporal Validation From 2016 to 2020;Clinical Orthopaedics & Related Research;2024-05-22