Affiliation:
1. Intuitive Surgical, Inc., USA
Abstract
Endoscopic video recordings are widely used in minimally invasive robot-assisted surgery, but when the endoscope is outside the patient’s body, it can capture irrelevant segments that may contain sensitive information. To address this, we propose a framework that accurately detects out-of-body frames in surgical videos by leveraging self-supervision with minimal data labels. We use a massive amount of unlabeled endoscopic images to learn meaningful representations in a self-supervised manner. Our approach, which involves pre-training on an auxiliary task and fine-tuning with limited supervision, outperforms previous methods for detecting out-of-body frames in surgical videos captured from da Vinci X and Xi surgical systems. The average F1 scores range from [Formula: see text] to [Formula: see text]. Remarkably, using only [Formula: see text] of the training labels, our approach still maintains an average F1 score performance above 97, outperforming fully-supervised methods with [Formula: see text] fewer labels. These results demonstrate the potential of our framework to facilitate the safe handling of surgical video recordings and enhance data privacy protection in minimally invasive surgery.
Publisher
World Scientific Pub Co Pte Ltd
Subject
Applied Mathematics,Artificial Intelligence,Computer Science Applications,Human-Computer Interaction,Biomedical Engineering
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献