Affiliation:
1. Division of Research and Optimal Patient Care, American College of Surgeons, Chicago, IL
2. Department of Surgery, Washington University in St. Louis, Center for Health Policy and the Olin Business School at Washington University in St. Louis, John Cochran Veterans Affairs Medical Center; and BJC Healthcare, St. Louis, MO
3. Department of Surgery, University of California Los Angeles David Geffen School of Medicine and the VA Greater Los Angeles Healthcare System, Los Angeles, CA
Abstract
Objective:
To compare the performance of the ACS NSQIP “universal” risk calculator (N-RC) to operation-specific RCs.
Background:
Resources have been directed toward building operation-specific RCs because of an implicit belief that they would provide more accurate risk estimates than the N-RC. However, operation-specific calculators may not provide sufficient improvements in accuracy to justify the costs in development, maintenance, and access.
Methods:
For the N-RC, a cohort of 5,020,713 NSQIP patient records was randomly divided into 80% for machine learning algorithm training and 20% for validation. Operation-specific risk calculators (OS-RCs) and OS-RCs with operation-specific predictors (OSP-RCs) were independently developed for each of 6 operative groups (colectomy, Whipple pancreatectomy, thyroidectomy, abdominal aortic aneurysm (open), hysterectomy/myomectomy, and total knee arthroplasty) and 14 outcomes, using the same 80%/20% rule applied to the appropriate subsets of the 5,020,713 records. Predictive accuracy was evaluated using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), and Hosmer-Lemeshow (H-L) P values for the 13 binary outcomes, and using mean squared error for the length of stay outcome.
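The validation scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's code: the function names (`split_80_20`, `auroc`, `average_precision`, `mean_squared_error`) are hypothetical, average precision is used as a common stand-in for AUPRC, and the abstract does not say how either area was actually computed.

```python
import random


def split_80_20(records, seed=0):
    """Randomly split records into 80% training and 20% validation."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def auroc(labels, scores):
    """AUROC via the rank-sum identity; tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0   # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def average_precision(labels, scores):
    """Average precision (assumed proxy for AUPRC); ties are not grouped."""
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    true_pos = 0
    precisions = []
    for k, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            precisions.append(true_pos / k)
    if not precisions:
        raise ValueError("average precision needs at least one positive")
    return sum(precisions) / len(precisions)


def mean_squared_error(observed, predicted):
    """Mean squared error, as used for the continuous length of stay outcome."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
```

In this sketch a model would be fitted on the first part returned by `split_80_20` and the three metrics computed only on the second part; for an operation-specific calculator the same split would be applied to that operation's subset of records.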
Results:
The N-RC was found to have greater AUROC (P = 0.002) and greater AUPRC (P < 0.001) compared to the OS-RC. No other statistically significant differences in accuracy, across the 3 risk calculator types, were found. There was an inverse relationship between the operation group sample size and magnitude of the difference in AUROC (r = −0.278; P = 0.014) and in AUPRC (r = −0.425; P < 0.001) between N-RC and OS-RC. The smaller the sample size, the greater the superiority of the N-RC.
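The sample-size relationship reported above is a correlation between each operation group's sample size and the N-RC minus OS-RC difference in a metric. A minimal sketch of that calculation follows, assuming the reported r is a Pearson coefficient (the abstract does not say which coefficient was used). The function name `pearson_r` is hypothetical, and the numbers in the usage lines are invented purely to show the direction of the effect; they are not study data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Invented illustration only (NOT study data): the AUROC advantage of the
# universal calculator shrinks as the operation group gets larger.
group_sizes = [1_000, 5_000, 20_000, 100_000]
auroc_diff_nrc_minus_osrc = [0.04, 0.02, 0.01, 0.00]
r = pearson_r(group_sizes, auroc_diff_nrc_minus_osrc)  # negative, i.e. inverse
```

A negative `r` here corresponds to the pattern described in the Results: the smaller the operation group, the larger the N-RC advantage.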
Conclusions:
While operation-specific RCs might be assumed to have advantages over a universal RC, their reliance on smaller datasets may reduce their ability to accurately estimate predictor effects. In the present study, this tradeoff between operation specificity and accuracy in estimating the effects of predictor variables favors the N-RC, though the clinical impact is likely to be negligible.
Publisher
Ovid Technologies (Wolters Kluwer Health)
Subject
Pharmacology (medical), Complementary and alternative medicine, Pharmaceutical Science
Cited by
1 article.