Breast cancer recurrence prediction with ensemble methods and cost-sensitive learning

Author:

Yang Pei-Tse (1), Wu Wen-Shuo (1), Wu Chia-Chun (1), Shih Yi-Nuo (2), Hsieh Chung-Ho (3), Hsu Jia-Lien (1)

Affiliation:

1. Department of Computer Science and Information Engineering, Fu Jen Catholic University, New Taipei City, Taiwan, Republic of China

2. Department of Occupational Therapy, Fu Jen Catholic University, New Taipei City, Taiwan, Republic of China

3. Department of General Surgery, Shin Kong Wu Ho-Su Memorial Hospital, Taipei, Taiwan, Republic of China

Abstract

Breast cancer is one of the most common cancers in women worldwide. Thanks to improvements in medical treatment, most breast cancer patients go into remission. However, these patients then face the next challenge: recurrence of breast cancer, which may cause more severe effects and even death. Predicting breast cancer recurrence is therefore crucial for reducing mortality. This paper proposes a prediction model for breast cancer recurrence based on nominal and numeric clinical features. Our data consist of 1,061 patient records from the Breast Cancer Registry of Shin Kong Wu Ho-Su Memorial Hospital, collected between 2011 and 2016, of which 37 are labeled as breast cancer recurrence. Each record has 85 features. Our approach consists of three stages. First, we apply data preprocessing and feature selection to consolidate the dataset; six features are selected for the following stages. Next, we apply resampling techniques to address class imbalance. Finally, we construct two classifiers, AdaBoost and cost-sensitive learning, to predict the risk of recurrence, and evaluate their performance with three-fold cross-validation. With the AdaBoost method, we achieve an accuracy of 0.973 and a sensitivity of 0.675. By combining AdaBoost with the cost-sensitive method, our model achieves an accuracy of 0.468 and a substantially higher sensitivity of 0.947, so that almost no recurrence case is missed. Our model can serve as a supporting tool for scheduling and evaluating follow-up visits, enabling early intervention and more advanced treatment to lower cancer mortality.
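The trade-off between accuracy and sensitivity reported above can be illustrated with a small sketch. This is not the authors' code: it uses only the class sizes given in the abstract (1,061 records, 37 recurrences), omits AdaBoost itself, and stands in for the paper's cost-sensitive learner with a simple expected-cost decision threshold. The function names and the cost values (1 and 9) are hypothetical and chosen only for illustration.

```python
import random

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = recurrence."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, _, _ = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)

def sensitivity(y_true, y_pred):
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0

def undersample(X, y, seed=0):
    """Random undersampling: drop majority-class rows until classes are equal in size."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep = sorted(minority + rng.sample(majority, len(minority)))
    return [X[i] for i in keep], [y[i] for i in keep]

def cost_sensitive_threshold(cost_fp, cost_fn):
    """Predicting 'recurrence' is cheaper in expectation when
    p * cost_fn >= (1 - p) * cost_fp, i.e. p >= cost_fp / (cost_fp + cost_fn)."""
    return cost_fp / (cost_fp + cost_fn)

def predict_with_costs(probs, cost_fp, cost_fn):
    """Turn predicted recurrence probabilities into labels using the cost-based threshold."""
    threshold = cost_sensitive_threshold(cost_fp, cost_fn)
    return [1 if p >= threshold else 0 for p in probs]

# Class sizes taken from the abstract; the feature values are placeholders.
N_TOTAL, N_RECUR = 1061, 37
labels = [1] * N_RECUR + [0] * (N_TOTAL - N_RECUR)
features = [[float(i)] for i in range(N_TOTAL)]

# A classifier that never predicts recurrence looks accurate but finds no cases.
always_negative = [0] * N_TOTAL
baseline_acc = accuracy(labels, always_negative)
baseline_sens = sensitivity(labels, always_negative)

# Undersampling yields a balanced training set of 37 + 37 records.
X_bal, y_bal = undersample(features, labels)

# Hypothetical costs: a missed recurrence counts 9 times as much as a false alarm.
example_preds = predict_with_costs([0.05, 0.2, 0.6], cost_fp=1, cost_fn=9)
```

With only 37 positive records, a model that always predicts "no recurrence" already reaches an accuracy of 1024/1061 (about 0.965) with zero sensitivity, which is why accuracy alone is a poor guide here. Raising the cost of a false negative lowers the decision threshold, so more patients are flagged: sensitivity rises while accuracy falls, the same direction of trade-off as in the reported results.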

Publisher

Walter de Gruyter GmbH

Subject

General Medicine


Cited by 18 articles.
