Author:
Wang Yiling, Lombardo Elia, Huang Lili, Avanzo Michele, Fanetti Giuseppe, Franchin Giovanni, Zschaeck Sebastian, Weingärtner Julian, Belka Claus, Riboldi Marco, Kurz Christopher, Landry Guillaume
Abstract
Objectives
Deep learning-based auto-segmentation of head and neck cancer (HNC) tumors is expected to be more reproducible than manual delineation. Positron emission tomography (PET) and computed tomography (CT) are commonly used for tumor segmentation. However, current methods still struggle with whole-body scans, where manual selection of a bounding box may be required. Moreover, institutions may apply different guidelines for tumor delineation. This study aimed to explore the auto-localization and segmentation of HNC tumors from entire PET/CT scans and to investigate the transferability of trained baseline models to external real-world cohorts.
Methods
We employed a 2D Retina Unet to localize HNC tumors in whole-body PET/CT and a regular Unet to segment the union of the tumor and involved lymph nodes. For comparison, 2D and 3D Retina Unets were also implemented to localize and segment the same target in an end-to-end manner. Segmentation performance was evaluated using the Dice similarity coefficient (DSC) and the 95th percentile Hausdorff distance (HD95). Delineated PET/CT scans from the HECKTOR challenge were used to train the baseline models with 5-fold cross-validation. Another 271 delineated PET/CT scans from three institutions (MAASTRO, CRO, BERLIN) were used for external testing. Finally, facility-specific transfer learning was applied to investigate the improvement in segmentation performance over the baseline models.
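The two evaluation metrics named above can be illustrated with a minimal, dependency-free sketch. It is not the authors' evaluation code: masks are represented as sets of integer voxel coordinates, surfaces are taken as voxels with at least one face-neighbor outside the mask, the percentile uses the nearest-rank rule, and HD95 is taken as the larger of the two directed 95th-percentile surface distances. All of these are assumptions, and other implementations make different choices (for example, interpolated percentiles or pooling both directions). The function names and the toy masks are hypothetical.

```python
import math


def dice(a, b):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    if not a and not b:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


def surface(mask):
    """Voxels of the mask that have at least one face-neighbor outside it."""
    border = set()
    for v in mask:
        for axis in range(len(v)):
            for step in (-1, 1):
                neighbor = v[:axis] + (v[axis] + step,) + v[axis + 1:]
                if neighbor not in mask:
                    border.add(v)
    return border


def directed_distances(src, dst, spacing):
    """For each point in src, the distance (in mm) to the nearest point in dst."""
    return [
        min(
            math.sqrt(sum(((p - q) * s) ** 2 for p, q, s in zip(u, v, spacing)))
            for v in dst
        )
        for u in src
    ]


def percentile_nearest_rank(values, q):
    """q-th percentile using the nearest-rank rule (an assumed convention)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100.0 * len(ordered)))
    return ordered[rank - 1]


def hd95(a, b, spacing):
    """Larger of the two directed 95th-percentile surface distances.

    Both masks must be non-empty; spacing is the voxel size per axis in mm.
    """
    sa, sb = surface(a), surface(b)
    d_ab = directed_distances(sa, sb, spacing)
    d_ba = directed_distances(sb, sa, spacing)
    return max(percentile_nearest_rank(d_ab, 95), percentile_nearest_rank(d_ba, 95))


# Toy 2D example: a 4x4 reference square and a prediction shifted by one voxel.
reference = {(x, y) for x in range(4) for y in range(4)}
prediction = {(x + 1, y) for x in range(4) for y in range(4)}

dsc_value = dice(reference, prediction)
hd95_value = hd95(reference, prediction, spacing=(1.0, 1.0))
print(f"DSC = {dsc_value:.2f}, HD95 = {hd95_value:.1f} mm")
```

The brute-force nearest-neighbor search is quadratic in the number of surface voxels, so this form only suits small illustrative masks; distance transforms or spatial trees are the usual choice for full-resolution volumes.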
Results
Encouraging localization results were observed, with a maximum omnidirectional tumor center difference below 6.8 cm in external testing. The three baseline models yielded similar averaged cross-validation (CV) results, with DSC in the range 0.71–0.75, while the averaged CV HD95 was 8.6, 10.7 and 9.8 mm for the regular Unet, 2D and 3D Retina Unets, respectively. More than a 10% drop in DSC and a 40% increase in HD95 were observed when the baseline models were tested directly on the three external cohorts. After facility-specific training, external testing improved for all models. The regular Unet had the best DSC (0.70) for the MAASTRO cohort, and the best HD95 (7.8 and 7.9 mm) for the MAASTRO and CRO cohorts. The 2D Retina Unet had the best DSC (0.76 and 0.67) for the CRO and BERLIN cohorts, and the best HD95 (12.4 mm) for the BERLIN cohort.
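The abstract does not define how the tumor center difference is computed. One plausible reading, sketched here purely as an assumption, is the Euclidean distance between the centroids of the predicted and reference regions in physical units. The function names, voxel spacing, and toy masks below are hypothetical.

```python
import math


def centroid_mm(mask, spacing):
    """Centroid of a non-empty set of voxel coordinates, scaled to mm."""
    n = len(mask)
    return tuple(s * sum(v[i] for v in mask) / n for i, s in enumerate(spacing))


def center_difference_mm(pred, ref, spacing):
    """Euclidean distance between the two centroids (assumed definition)."""
    return math.dist(centroid_mm(pred, spacing), centroid_mm(ref, spacing))


# Toy 3D example: a 4x4x4 reference cube and a prediction shifted by
# 3 voxels along x and 2 voxels along z, with 2 mm spacing along z.
ref_mask = {(x, y, z) for x in range(4) for y in range(4) for z in range(4)}
pred_mask = {(x + 3, y, z + 2) for (x, y, z) in ref_mask}

difference = center_difference_mm(pred_mask, ref_mask, spacing=(1.0, 1.0, 2.0))
print(f"center difference = {difference:.1f} mm")
```

Scaling by the voxel spacing before taking the distance matters for anisotropic PET/CT grids, where slice thickness often differs from in-plane resolution.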
Conclusion
The regular Unet outperformed the other two baseline models in CV and most external testing cohorts. Facility-specific transfer learning can potentially improve HNC segmentation performance for individual institutions, where the 2D Retina Unets could achieve comparable or even better results than the regular Unet.
Funder
Sichuan Province Science and Technology Support Program
China Postdoctoral Science Foundation
Fundamental Research Funds for the Central Universities
German Research Foundation (DFG), Research Training Group GRK
Förderprogramm für Forschung und Lehre, Medical Faculty, LMU Munich
Publisher
Springer Science and Business Media LLC
Cited by
4 articles.