Performance of Statistical and Machine Learning Risk Prediction Models for Surveillance Benefits and Failures in Breast Cancer Survivors

Author:

Su Yu-Ru (1), Buist Diana S.M. (1), Lee Janie M. (2), Ichikawa Laura (1), Miglioretti Diana L. (1, 3), Bowles Erin J. Aiello (1), Wernli Karen J. (1), Kerlikowske Karla (4, 5, 6), Tosteson Anna (7), Lowry Kathryn P. (2), Henderson Louise M. (8), Sprague Brian L. (9, 10), Hubbard Rebecca A. (11)

Affiliation:

1. Kaiser Permanente Washington Health Research Institute, Kaiser Permanente WA, Seattle, Washington.

2. Department of Radiology, University of Washington and Seattle Cancer Care Alliance, Seattle, Washington.

3. Division of Biostatistics, Department of Public Health Sciences, University of California Davis, Davis, California.

4. Department of Medicine, University of California, San Francisco, California.

5. Department of Epidemiology and Biostatistics, University of California, San Francisco, California.

6. General Internal Medicine Section, Department of Veterans Affairs, University of California, San Francisco, California.

7. The Dartmouth Institute for Health Policy and Clinical Practice and Norris Cotton Cancer Center, Geisel School of Medicine at Dartmouth, Lebanon, New Hampshire.

8. Department of Radiology, University of North Carolina, Chapel Hill, North Carolina.

9. Department of Surgery, University of Vermont, Burlington, Vermont.

10. Department of Radiology, University of Vermont, Burlington, Vermont.

11. Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania.

Abstract

Background: Machine learning (ML) approaches facilitate risk prediction model development using high-dimensional predictors and higher-order interactions at the cost of model interpretability and transparency. We compared the relative predictive performance of statistical and ML models to guide modeling strategy selection for surveillance mammography outcomes in women with a personal history of breast cancer (PHBC).

Methods: We cross-validated seven risk prediction models for two surveillance outcomes: failure (breast cancer within 12 months of a negative surveillance mammogram) and benefit (surveillance-detected breast cancer). We included 9,447 mammograms (495 failures, 1,414 benefits, and 7,538 nonevents) from 1996 to 2017, using 1:4 matched case–control samples of women with PHBC in the Breast Cancer Surveillance Consortium. We assessed the performance of conventional regression, regularized regressions (LASSO and elastic-net), and ML methods (random forests and gradient boosting machines) by evaluating their calibration and, among well-calibrated models, comparing the area under the receiver operating characteristic curve (AUC) and 95% confidence intervals (CI).

Results: LASSO and elastic-net consistently provided well-calibrated predicted risks for surveillance failure and benefit. The AUCs of LASSO and elastic-net were both 0.63 (95% CI, 0.60–0.66) for surveillance failure and 0.66 (95% CI, 0.64–0.68) for surveillance benefit, the highest among well-calibrated models.

Conclusions: For predicting breast cancer surveillance mammography outcomes, regularized regression outperformed other modeling approaches and balanced the trade-off between model flexibility and interpretability.

Impact: Regularized regression may be preferred for developing risk prediction models in other contexts with rare outcomes, similar training sample sizes, and low-dimensional features.
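The two evaluation criteria named in the Methods, calibration and AUC with a 95% CI, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the expected-to-observed ratio as the calibration summary, the percentile bootstrap for the interval, and the toy data are all assumptions made for illustration, and the sketch ignores the matched case–control sampling used in the study.

```python
import random


def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) formula; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def expected_observed_ratio(labels, scores):
    """Calibration-in-the-large (hypothetical choice): sum of predicted risks over observed events."""
    return sum(scores) / sum(labels)


def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, not taken from the paper)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both events and nonevents
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data only (not study data): 1 = event, 0 = nonevent.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(toy_labels, toy_scores))  # 0.75: three of four event/nonevent pairs are ranked correctly
print(expected_observed_ratio(toy_labels, toy_scores))
print(bootstrap_auc_ci(toy_labels, toy_scores))
```

An expected-to-observed ratio near 1 suggests that predicted risks match the observed event rate on average. In the workflow the abstract describes, AUCs are compared only among models that first pass the calibration check.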

Funder

National Institutes of Health

Patient-Centered Outcomes Research Institute

Agency for Healthcare Research and Quality

Publisher

American Association for Cancer Research (AACR)

Subject

Oncology, Epidemiology


Cited by 3 articles.
