Automatic detection of Breast Cancer by using Ensemble Learning

Author:

León Carlos de la Cruz (1), Agarwal Deevyankar (1), Torre-Díez Isabel de la (1), Río-Solá M Lourdes (2)

Affiliation:

1. University of Valladolid

2. University Hospital of Valladolid

Abstract

Breast cancer is a significant health problem, with about 2 million new cases diagnosed and 600,000 deaths each year. Early detection and accurate diagnosis are critical to patient prognosis. Machine learning (ML) models show promising results for accurate and efficient diagnosis. In the present work, the performance of several ML models is studied on the publicly accessible online dataset "Wisconsin Breast Cancer Dataset". The models are built with logistic regression, Random Forest, Naïve Bayes, and Support Vector Machine (SVM) algorithms, the last of which performs best. An ensemble model combining the best proposed models is then implemented. It comprises three models: an SVM model on the standardized dataset, a logistic regression model on the standardized dataset with a 10-component PCA, and a Random Forest model on the standardized dataset with 60 estimators. All models are evaluated on a test set formed by 30% of the original dataset. The models are combined using a weighted majority voting system in which the SVM model has a weight of 0.5, while the logistic regression and Random Forest models each have a weight of 0.25. The ensemble voting model improves on the results of the individual models, reaching an accuracy of 98%, a precision of 97%, a recall of 99%, and an F1 score of 98%.
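
The weighted voting step described in the abstract can be illustrated with a minimal sketch. It assumes hard voting on class labels, which is one reading of "majority weighted voting"; the abstract does not say whether labels or probabilities are combined. The function name `weighted_vote`, the model keys, the integer labels, and the tie-breaking rule (smallest label wins) are hypothetical choices made for illustration, not details taken from the paper. Only the weights 0.5, 0.25, and 0.25 come from the abstract.

```python
from collections import defaultdict

# Weights reported in the abstract: SVM 0.5, logistic regression 0.25,
# Random Forest 0.25. The dictionary keys are hypothetical names.
WEIGHTS = {"svm": 0.5, "logreg": 0.25, "rf": 0.25}


def weighted_vote(predictions, weights=WEIGHTS):
    """Combine one predicted label per model by weighted hard voting.

    predictions maps a model name to the class label it predicts for a
    single sample. The label with the largest summed weight is returned.
    Assumption: ties are broken in favour of the smallest label.
    """
    totals = defaultdict(float)
    for model, label in predictions.items():
        totals[label] += weights[model]
    best = max(totals.values())
    return min(label for label, total in totals.items() if total == best)


# SVM and logistic regression agree: 0.75 against 0.25.
agree = weighted_vote({"svm": 1, "logreg": 1, "rf": 0})

# SVM stands alone against the other two: 0.5 against 0.5, a tie.
tie = weighted_vote({"svm": 1, "logreg": 0, "rf": 0})
```

With these weights the SVM can never be outvoted. When the other two models both disagree with it, the totals tie at 0.5 each, so the tie-breaking rule decides that case and would need to be fixed explicitly in any real implementation.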

Publisher

Research Square Platform LLC

