Affiliation:
1. Laboratory of Medical Physics, Medical School, National and Kapodistrian University of Athens, Athens, Greece
2. 1st Department of Surgery, Laikon General Hospital, Medical School, National and Kapodistrian University of Athens, Athens, Greece
Abstract
Background: Real-time prediction of the remaining surgery duration (RSD) is important for optimal scheduling of resources in the operating room.
Methods: We focus on the intraoperative prediction of RSD from laparoscopic video. We present an extensive evaluation of seven common deep learning models, a proposed model based on the Transformer architecture (TransLocal), and four baseline approaches. The proposed pipeline includes a CNN-LSTM for feature extraction from salient regions within short video segments and a Transformer with local attention mechanisms.
Results: Using the Cholec80 dataset, TransLocal yielded the best performance (mean absolute error (MAE) = 7.1 min). For long and short surgeries, the MAE was 10.6 and 4.4 min, respectively. Thirty minutes before the end of surgery, the MAE was 6.2, 7.2, and 5.5 min for all, long, and short surgeries, respectively.
Conclusions: The proposed technique achieves state-of-the-art results. In the future, we aim to incorporate intraoperative indicators and pre-operative data.
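As an illustration of the reported metric, the sketch below shows one way an MAE for RSD could be computed, both over a whole surgery and restricted to the final minutes before its end. It is not the authors' evaluation code: the function name `rsd_mae`, its parameters, the sampling of predictions at fixed elapsed times, and the toy numbers are all assumptions made for this example, and the abstract does not specify how the 30-minute figure was computed.

```python
def rsd_mae(pred_min, elapsed_min, total_min, last_window_min=None):
    """Mean absolute error (minutes) of remaining-surgery-duration predictions.

    pred_min        -- predicted remaining minutes at each sampled time point
    elapsed_min     -- minutes elapsed since the start at each time point
    total_min       -- true total duration of the surgery
    last_window_min -- if given, keep only time points whose true remaining
                       duration is at most this many minutes (hypothetical
                       way of restricting the error to the end of surgery)
    """
    errors = []
    for pred, elapsed in zip(pred_min, elapsed_min):
        true_rsd = total_min - elapsed  # ground-truth remaining duration
        if last_window_min is not None and true_rsd > last_window_min:
            continue  # time point lies outside the requested window
        errors.append(abs(pred - true_rsd))
    if not errors:
        raise ValueError("no predictions fall inside the requested window")
    return sum(errors) / len(errors)


# Toy data (invented): a 40-minute surgery sampled every 10 minutes.
elapsed = [0, 10, 20, 30]
pred = [35, 32, 18, 12]

overall = rsd_mae(pred, elapsed, total_min=40)
last_30 = rsd_mae(pred, elapsed, total_min=40, last_window_min=30)
```

A dataset-level figure would then average such per-surgery values, optionally grouped into long and short surgeries; the grouping threshold is not given in the abstract.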