Author:
Islam Taminul, Sheakh Md. Alif, Tahosin Mst. Sazia, Hena Most. Hasna, Akash Shopnil, Bin Jardan Yousef A., Fentahun Wondmie Gezahign, Nafidi Hiba-Allah, Bourhia Mohammed
Abstract
Breast cancer has rapidly increased in prevalence in recent years, making it one of the leading causes of mortality worldwide. Among all cancers, it is by far the most common. Manual diagnosis requires significant time and expertise, so machine-based prediction can support earlier detection and help limit the spread of the disease. Machine learning and explainable AI are valuable for classification because they provide accurate predictions and also show how a model arrives at its decisions, which makes the results easier to understand and trust. In this study, we evaluate and compare the accuracy, precision, recall, and F1 score of five supervised machine learning methods (decision tree, random forest, logistic regression, naive Bayes, and XGBoost) on a primary dataset of 500 patients from Dhaka Medical College Hospital. We also applied SHAP analysis to the XGBoost model to interpret its predictions and understand the impact of each feature on the model’s output. We compared the classification accuracy of the algorithms with one another and with results reported in the literature. In the final evaluation, XGBoost achieved the best accuracy, 97%.
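The four evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. The function name `classification_metrics`, the `Metrics` container, and the toy labels below are hypothetical and are not taken from the study; the sketch only shows how accuracy, precision, recall, and F1 score follow from the confusion-matrix counts of a binary classifier.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def classification_metrics(y_true, y_pred, positive=1):
    """Compute binary classification metrics from true and predicted labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Metrics(accuracy, precision, recall, f1)


# Toy labels for illustration only (1 = malignant, 0 = benign); not study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
m = classification_metrics(y_true, y_pred)
print(m)
```

Reporting precision and recall alongside accuracy matters for a diagnostic task, since a missed malignant case (a false negative) and a false alarm (a false positive) carry different costs that accuracy alone does not separate.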
Publisher
Springer Science and Business Media LLC