Explainable Machine Learning for Prediction of Procedural Case Durations Developed Using a Large Multicenter Database: Algorithm Development and Validation (Preprint)

Published: 2022-12-09

Author:

Kendale Samir, Bishara Andrew, Burns Michael, Solomon Stuart, Corriere Matthew, Mathis Michael

Abstract

BACKGROUND

Accurately projecting procedural case durations is complex but critical to planning perioperative staffing, allocating operating room resources, and communicating with patients. Nonlinear prediction models using machine learning methods may provide opportunities for hospitals to improve upon current estimates of procedure duration.

OBJECTIVE

We hypothesized that a machine learning algorithm derived from a large multicenter dataset would predict surgical procedure duration more accurately than a baseline linear regression approach. Because the algorithm is explainable, its results also provide additional insight regarding procedure duration and variability.

METHODS

A total of 1,177,893 procedures from 13 academic and private hospitals between 2016 and 2019 were used. Deep learning, gradient boosting, and ensemble machine learning models were generated using perioperative data available at three distinct time points: time of scheduling, time of arrival to the operating/procedure room (primary model), and time of surgical incision/procedure start. The primary outcome was procedure duration, defined by the time between arrival and departure of the patient from the procedure room. Model performance was assessed by mean absolute error, proportion of predictions within 20% of actual duration, and other standard metrics. Performance was compared to a baseline method of historical means within a linear regression model. Model features driving predictions were assessed using Shapley values and permutation feature importance.
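The two headline metrics named above can be illustrated with a minimal Python sketch. The function names and the four toy cases are assumptions made for illustration; they are not the study's code or data.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap, in minutes, between actual and predicted durations."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def proportion_within(actual, predicted, tolerance=0.20):
    """Share of predictions that fall within `tolerance` (as a fraction) of the actual duration."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(p - a) <= tolerance * a)
    return hits / len(actual)


# Hypothetical durations in minutes (illustrative only, not from the dataset).
actual_minutes = [60, 100, 150, 200]
predicted_minutes = [70, 95, 200, 190]

mae = mean_absolute_error(actual_minutes, predicted_minutes)
within_20 = proportion_within(actual_minutes, predicted_minutes)
print(mae, within_20)  # 18.75 0.75
```

In this toy example only the 150-minute case, predicted at 200 minutes, misses the 20% band, so three of four predictions count as within tolerance.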

RESULTS

Across all procedures, median procedure duration was 94 minutes (interquartile range of 50-167 minutes). In estimating procedure duration, the gradient boosting machine was the best-performing model, with a mean absolute error of 34 minutes and 46% of predictions within 20% of actual duration in the test dataset. This represented a statistically and clinically significant improvement in predictions compared to the baseline linear regression model (mean absolute error of 43 minutes, p < 0.001; 39% of predictions within 20% of actual duration). The most important features in model training were historical procedure duration by surgeon, the word “free” within the procedure text, and time of day.
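Permutation feature importance, one of the two explanation methods named in the Methods, can be sketched as follows: shuffle one feature column at a time and record how much the mean absolute error rises. This is a minimal sketch under stated assumptions. The toy predictor, the two features, and all values are hypothetical stand-ins, not the study's gradient boosting machine or its data.

```python
import random


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def permutation_importance(predict, rows, actual, n_features, n_repeats=20, seed=0):
    """Mean increase in MAE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_absolute_error(actual, [predict(r) for r in rows])
    importances = []
    for j in range(n_features):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the outcome
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            error = mean_absolute_error(actual, [predict(r) for r in shuffled])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical rows: (surgeon's historical mean duration in minutes, hour of day).
rows = [(60, 8), (95, 9), (150, 11), (200, 13), (45, 15)]
actual = [60, 95, 150, 200, 45]


def toy_predict(row):
    # Assumed stand-in model: relies only on the historical mean, ignores hour of day.
    return row[0]


importances = permutation_importance(toy_predict, rows, actual, n_features=2)
print(importances)
```

Because the toy predictor ignores hour of day, shuffling that column leaves the error unchanged and its importance is zero, while shuffling the historical mean raises the error.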

CONCLUSIONS

Nonlinear models using machine learning techniques may be used to generate high-performing, automatable, explainable, and scalable prediction models for procedure duration. Medi

Publisher

JMIR Publications Inc.
