Multi-institutional Prognostic Modeling in Head and Neck Cancer: Evaluating Impact and Generalizability of Deep Learning and Radiomics

Author:

Kazmierski Michal (1,2), Welch Mattea (1,2,3), Kim Sejin (1,2), McIntosh Chris (1,3,4), Rey-McIntyre Katrina (4), Huang Shao Hui (4,5), Patel Tirth (3,4), Tadic Tony (4,5), Milosevic Michael (3,4,5), Liu Fei-Fei (4,5), Ryczkowski Adam (6,7), Kazmierska Joanna (7,8), Ye Zezhong (9,10), Plana Deborah (9,10), Aerts Hugo J.W.L. (9,10,11), Kann Benjamin H. (9,10), Bratman Scott V. (1,4,5), Hope Andrew J. (4,5), Haibe-Kains Benjamin (1,2)

Affiliation:

1. Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada.

2. Princess Margaret Cancer Centre, Toronto, Ontario, Canada.

3. TECHNA Institute, Toronto, Ontario, Canada.

4. Radiation Medicine Program, Princess Margaret Cancer Centre, Toronto, Ontario, Canada.

5. Department of Radiation Oncology, University of Toronto, Ontario, Canada.

6. Department of Medical Physics, Greater Poland Cancer Centre, Poznan, Poland.

7. Department of Electroradiology, University of Medical Sciences, Poznan, Poland.

8. Department of Radiotherapy II, Greater Poland Cancer Centre, Poznan, Poland.

9. Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, Massachusetts.

10. Department of Radiation Oncology, Dana-Farber Cancer Institute / Brigham and Women's Hospital, Boston, Massachusetts.

11. Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands.

Abstract

Artificial intelligence (AI) and machine learning (ML) are becoming critical in developing and deploying personalized medicine and targeted clinical trials. Recent advances in ML have enabled the integration of wider ranges of data, including both medical records and imaging (radiomics). However, developing prognostic models is complex: no modeling strategy is universally superior, and validating a model requires large and diverse datasets to demonstrate that a model developed on one dataset, regardless of method, applies to other datasets both internally and externally. Using a retrospective dataset of 2,552 patients from a single institution and a strict evaluation framework that included external validation on three external patient cohorts (873 patients), we crowdsourced the development of ML models to predict overall survival in head and neck cancer (HNC) using electronic medical records (EMR) and pretreatment radiological images. To assess the relative contribution of radiomics to predicting HNC prognosis, we compared 12 different models using imaging and/or EMR data. The model with the highest accuracy used multitask learning on clinical data and tumor volume, achieving high prognostic accuracy for 2-year and lifetime survival prediction and outperforming models relying on clinical data only, engineered radiomics, or complex deep neural network architectures. However, when we attempted to extend the best-performing models from this large training dataset to other institutions, we observed significant reductions in model performance in those datasets, highlighting the importance of detailed population-based reporting for AI/ML model utility and of stronger validation frameworks.
We have developed highly prognostic models for overall survival in HNC using EMRs and pretreatment radiological images based on a large, retrospective dataset of 2,552 patients from our institution. Diverse ML approaches were used by independent investigators. The model with the highest accuracy used multitask learning on clinical data and tumor volume. External validation of the top three performing models on three datasets (873 patients) with significant differences in the distributions of clinical and demographic variables demonstrated significant decreases in model performance.

Significance: ML combined with simple prognostic factors outperformed multiple advanced CT radiomics and deep learning methods. ML models provided diverse solutions for prognosis of patients with HNC, but their prognostic value is affected by differences in patient populations, and the models require extensive validation.

Funder

Canadian HIV Trials Network, Canadian Institutes of Health Research

Publisher

American Association for Cancer Research (AACR)
