Author:
Marchese Aizenman Gabriel,Salvagnini Pietro,Cherubini Andrea,Biffi Carlo
Abstract
BackgroundEnsuring accurate polyp detection during colonoscopy is essential for preventing colorectal cancer (CRC). Recent advances in deep learning-based computer-aided detection (CADe) systems have shown promise in enhancing endoscopists’ performances. Effective CADe systems must achieve high polyp detection rates from the initial seconds of polyp appearance while maintaining low false positive (FP) detection rates throughout the procedure.MethodWe integrated four open-access datasets into a unified platform containing over 340,000 images from various centers, including 380 annotated polyps, with distinct data splits for comprehensive model development and benchmarking. The REAL-Colon dataset, comprising 60 full-procedure colonoscopy videos from six centers, is used as the fifth dataset of the platform to simulate clinical conditions for model evaluation on unseen center data. Performance assessment includes traditional object detection metrics and new metrics that better meet clinical needs. Specifically, by defining detection events as sequences of consecutive detections, we compute per-polyp recall at early detection stages and average per-patient FPs, enabling the generation of Free-Response Receiver Operating Characteristic (FROC) curves.ResultsUsing YOLOv7, we trained and tested several models across the proposed data splits, showcasing the robustness of our open-access platform for CADe system development and benchmarking. The introduction of new metrics allows for the optimization of CADe operational parameters based on clinically relevant criteria, such as per-patient FPs and early polyp detection. Our findings also reveal that omitting full-procedure videos leads to non-realistic assessments and that detecting small polyp bounding boxes poses the greatest challenge.ConclusionThis study demonstrates how newly available open-access data supports ongoing research progress in environments that closely mimic clinical settings. The introduced metrics and FROC curves illustrate CADe clinical efficacy and can aid in tuning CADe hyperparameters.