Affiliation:
1. Department of Radiation Oncology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
Abstract
Background
An absorbable hydrogel spacer injected between the prostate and rectum is gaining popularity for rectal sparing. The spacer alters patient anatomy and thus requires new auto‐contouring models.
Purpose
To report the development and comprehensive evaluation of two deep‐learning models for patients injected with a radio‐transparent (model I) versus a radiopaque (model II) spacer.
Methods and materials
Model I was trained and cross‐validated on 135 cases with a transparent spacer and tested on 24 cases. Using refined training methods, model II was trained and cross‐validated on the same dataset, but with the Hounsfield Unit distribution in the spacer overridden by that obtained from ten cases with an opaque spacer. Model II was tested on 64 cases. The models auto‐contour eight regions of interest (ROIs): spacer, prostate, proximal seminal vesicles (SVs), left and right femurs, bladder, rectum, and penile bulb. Qualitatively, each auto contour (AC), as well as the composite set, was assessed against the manual contour (MC) by a radiation oncologist using a four‐point scoring scale: 1 (accepted directly or after minor editing), 2 (accepted after moderate editing), 3 (accepted after major editing), and 4 (rejected). The efficiency gain was characterized by the mean score as nearly complete [1–1.75], substantial (1.75–2.5], meaningful (2.5–3.25], or none (3.25–4.00]. Quantitatively, the geometric similarity between AC and MC was evaluated by the Dice similarity coefficient (DSC) and the mean distance to agreement (MDA), using the tolerances recommended by the AAPM TG‐132 report. The results of the two models were compared to examine the outcome of the refined training methods. The large number of testing cases for model II allowed further investigation of inter‐observer variability in the clinical dataset.
The correlation between score and DSC/MDA was studied for the ROIs with 10 or more counts of each acceptable score (1, 2, 3).
Results
For model I/model II: the mean score was 3.63/1.30 for transparent/opaque spacer, 2.71/2.16 for prostate, 3.25/2.44 for proximal SVs, 1.13/1.02 for both femurs, 2.25/1.25 for bladder, 3.00/2.06 for rectum, 3.38/2.42 for penile bulb, and 2.79/2.20 for the composite set; the mean DSC was 0.52/0.84 for spacer, 0.84/0.85 for prostate, 0.60/0.62 for proximal SVs, 0.94/0.96 for left femur, 0.95/0.96 for right femur, 0.91/0.95 for bladder, 0.81/0.84 for rectum, and 0.65/0.65 for penile bulb; and the mean MDA was 2.9/0.9 mm for spacer, 1.9/1.7 mm for prostate, 2.4/2.3 mm for proximal SVs, 0.8/0.5 mm for left femur, 0.7/0.5 mm for right femur, 1.5/0.9 mm for bladder, 2.3/1.9 mm for rectum, and 2.2/2.2 mm for penile bulb. Model II showed significantly improved scores for all ROIs, and significantly improved metrics for spacer, femurs, bladder, and rectum. Significant inter‐observer variability was found only for prostate. A highly linear correlation between the score and DSC was found for the two qualified ROIs (prostate and rectum).
Conclusions
The overall efficiency gain was meaningful for model I and substantial for model II. The ROIs meeting the clinical deployment criteria (mean score below 3.25, DSC above 0.8, and MDA below 2.5 mm) included prostate, both femurs, bladder, and rectum for both models, and spacer for model II.
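The efficiency-gain bands and the deployment criteria stated above can be applied mechanically to the reported means. The following is a minimal Python sketch, not the authors' code: the names `RESULTS`, `efficiency_gain`, `meets_deployment_criteria`, and `deployable_rois` are hypothetical, the strict inequalities are an assumed reading of "below" and "above", and the femur score is assumed to apply to each femur separately because the abstract reports one value for both.

```python
# Mean score, DSC, and MDA (mm) per ROI, copied from the abstract as
# (model I, model II) pairs. The dictionary layout itself is an assumption.
RESULTS = {
    "spacer":       {"score": (3.63, 1.30), "dsc": (0.52, 0.84), "mda": (2.9, 0.9)},
    "prostate":     {"score": (2.71, 2.16), "dsc": (0.84, 0.85), "mda": (1.9, 1.7)},
    "proximal SVs": {"score": (3.25, 2.44), "dsc": (0.60, 0.62), "mda": (2.4, 2.3)},
    "left femur":   {"score": (1.13, 1.02), "dsc": (0.94, 0.96), "mda": (0.8, 0.5)},
    "right femur":  {"score": (1.13, 1.02), "dsc": (0.95, 0.96), "mda": (0.7, 0.5)},
    "bladder":      {"score": (2.25, 1.25), "dsc": (0.91, 0.95), "mda": (1.5, 0.9)},
    "rectum":       {"score": (3.00, 2.06), "dsc": (0.81, 0.84), "mda": (2.3, 1.9)},
    "penile bulb":  {"score": (3.38, 2.42), "dsc": (0.65, 0.65), "mda": (2.2, 2.2)},
}


def efficiency_gain(mean_score):
    """Map a mean score to the bands [1-1.75], (1.75-2.5], (2.5-3.25], (3.25-4]."""
    if not 1.0 <= mean_score <= 4.0:
        raise ValueError("mean score must lie in [1, 4]")
    if mean_score <= 1.75:
        return "nearly complete"
    if mean_score <= 2.5:
        return "substantial"
    if mean_score <= 3.25:
        return "meaningful"
    return "none"


def meets_deployment_criteria(score, dsc, mda_mm):
    """Mean score below 3.25, DSC above 0.8, MDA below 2.5 mm (strict, assumed)."""
    return score < 3.25 and dsc > 0.8 and mda_mm < 2.5


def deployable_rois(model_index):
    """List ROIs passing all three criteria; 0 selects model I, 1 selects model II."""
    return [
        roi
        for roi, m in RESULTS.items()
        if meets_deployment_criteria(
            m["score"][model_index], m["dsc"][model_index], m["mda"][model_index]
        )
    ]


if __name__ == "__main__":
    # Composite-set mean scores reported for model I and model II.
    print(efficiency_gain(2.79), efficiency_gain(2.20))
    print(deployable_rois(0))
    print(deployable_rois(1))
```

With the reported means, this sketch classifies the composite scores as meaningful (model I) and substantial (model II), and selects prostate, both femurs, bladder, and rectum for both models plus spacer for model II, matching the conclusions above.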