Abstract
Purpose/Objective(s)
Here we investigate an approach to develop and clinically validate auto-contouring models for lymph node levels and structures of deglutition and mastication in the head and neck. An objective of this work is to provide high-quality resources to the scientific community to promote advancement of treatment planning, clinical trial management, and toxicity studies for the head and neck.

Materials/Methods
CTs of 145 patients who were irradiated for a head and neck primary malignancy at MD Anderson Cancer Center were retrospectively curated. Data were contoured by radiation oncologists and a resident physician and divided into two separate cohorts. One cohort was used to analyze lymph node levels (IA, IB, II, III, IV, V, RP) and the other was used to analyze 17 swallowing and chewing structures. Forty-seven patients were in the lymph node level cohort (training/testing = 32/15). All of these patients received definitive radiotherapy without a nodal dissection, to minimize anatomic perturbation of the lymph node levels. The remaining 98 patients formed the swallowing/chewing structures cohort (training/testing = 78/20). Separate nnUnet models were trained and validated on the two cohorts. For the lymph node levels, two double-blinded studies were used to score preference and clinical acceptability (on a 5-point Likert scale) of AI vs human contours. For the swallowing and chewing structures, clinical acceptability was scored. Quantitative analyses of the test sets were performed for AI vs human contours for all structures using the Dice Similarity Coefficient (DSC) and the 95th percentile Hausdorff distance (HD95).

Results
Across all lymph node levels (IA, IB, II, III, IV, V, RP), median DSC ranged from 0.77 to 0.89 for AI vs manual contours in the testing cohort. Across all lymph node levels, the AI contour was superior to or equally preferred to the manual contour at rates ranging from 75% to 91% in the first blinded study. In the second blinded study, physician preference for the manual vs AI contour was statistically different only for the RP contours (p < 0.01). Thus, there was no significant difference in clinical acceptability for nodal levels I-V for manual versus AI contours. Across all physician-generated contours, 82% were rated as usable with stylistic to no edits, and across all AI-generated contours, 92% were rated as usable with stylistic to no edits. For the swallowing and chewing structures, median DSC ranged from 0.86 to 0.96 and was greater than 0.90 for 11 of 17 structure types. Of the 340 contours in the test set, only 4% required minor edits.

Conclusions
An approach to generate clinically acceptable automated contours for lymph node levels and swallowing and chewing structures in the head and neck was demonstrated. For nodal levels I-V, there was no significant difference in clinical acceptability between manual and AI contours. In the two testing cohorts, for lymph node levels and for swallowing and chewing structures, only 8% and 4% of structures required minor edits, respectively. All testing and training data are being made publicly available on The Cancer Imaging Archive.
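
The two quantitative metrics named in the methods, DSC and the 95th percentile Hausdorff distance, can be illustrated with a minimal NumPy sketch. This is not the authors' evaluation code. The function names (`dice`, `surface_points`, `hd95`), the 6-connected surface definition, the voxel `spacing` parameter, and the choice to take the larger of the two directed 95th percentile distances are all assumptions made for illustration; other HD95 conventions exist and can give slightly different values.

```python
import numpy as np


def dice(a, b):
    """Dice Similarity Coefficient between two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom


def surface_points(mask, spacing):
    """Physical coordinates of foreground voxels with a background face-neighbor."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with background so voxels on the array border count as surface.
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    interior = padded.copy()
    for axis in range(mask.ndim):
        for shift in (1, -1):
            interior &= np.roll(padded, shift, axis=axis)
    surface = padded & ~interior
    coords = np.argwhere(surface) - 1  # undo the padding offset
    return coords * np.asarray(spacing, dtype=float)


def hd95(a, b, spacing=(1.0, 1.0, 1.0)):
    """Larger of the two directed 95th percentile surface distances."""
    pa = surface_points(a, spacing)
    pb = surface_points(b, spacing)
    if len(pa) == 0 or len(pb) == 0:
        return float("nan")  # undefined when either contour is empty
    # Brute-force pairwise distances: fine for toy masks, memory-heavy for real CTs.
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    a_to_b = d.min(axis=1)
    b_to_a = d.min(axis=0)
    return float(max(np.percentile(a_to_b, 95), np.percentile(b_to_a, 95)))


# Toy example: a 4x4x4 cube and the same cube shifted by one voxel.
manual = np.zeros((10, 10, 10), dtype=bool)
manual[2:6, 2:6, 2:6] = True
auto = np.zeros((10, 10, 10), dtype=bool)
auto[3:7, 2:6, 2:6] = True

print("DSC :", dice(manual, auto))
print("HD95:", hd95(manual, auto))
```

For full-resolution clinical contours, the pairwise distance matrix in this sketch would be impractically large; a distance-transform or k-d tree approach would be the usual substitute.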
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.