Abstract
Breast cancer is one of the most prevalent cancers affecting women worldwide. It begins when uncontrolled growth and division of breast cells lead to the formation of tumors. Predicting breast cancer is essential for early detection, treatment planning, and preventive measures, which ultimately improve patient outcomes and reduce mortality. In recent years, numerous studies have proposed a variety of methods for breast cancer prediction. Most, however, were evaluated on narrow, specific datasets, which often limits their accuracy and may make them unsuitable for clinical use. This study aims to address the limitations of existing models in terms of robustness and generalization across diverse datasets. We combined two metaheuristic algorithms, genetic algorithm (GA) and chemical reaction optimization (CRO), with four machine learning techniques: support vector machine (SVM), decision tree, random forest, and XGBoost. GA and CRO optimize the feature selection process, which enables the machine learning algorithms to predict more accurately. Experiments were conducted on three datasets: Wisconsin Breast Cancer (WBC), Breast Cancer from the University of California, Irvine (BC‐UCI), and Breast Cancer Coimbra (BCC), containing 569, 286, and 116 instances, respectively. The classifiers with optimized features consistently outperformed the classifiers without feature optimization in terms of accuracy, precision, recall, specificity, and F1 score. Among the recently published methods used for comparison, our method attained the highest accuracies of 99.64% on the WBC dataset and 98% on the BCC dataset, and the second highest accuracy of 99.12% on the BC‐UCI dataset. Overall, the comparative analysis indicates that our approach matches or outperforms most existing methods.
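To make the idea of metaheuristic feature selection concrete, the following is a minimal sketch of a GA wrapper, not the implementation used in this study. It rests on several assumptions: the data are synthetic, a simple nearest-centroid classifier stands in for SVM, decision tree, random forest, or XGBoost, and every name and parameter (`make_data`, `centroid_accuracy`, `ga_select`, the 0.01 per-feature penalty, population size, mutation rate) is hypothetical. CRO is not shown.

```python
import random

def make_data(n=120, n_informative=3, n_noise=7, seed=0):
    """Synthetic two-class data (an assumption, not one of the paper's datasets)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate labels so every split stays balanced
        informative = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        noise = [rng.gauss(0.0, 3.0) for _ in range(n_noise)]
        X.append(informative + noise)
        y.append(label)
    return X, y

def centroid_accuracy(X_train, y_train, X_val, y_val, mask):
    """Validation accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_val, y_val):
        dist = {label: sum((x[j] - c) ** 2 for j, c in zip(cols, cen))
                for label, cen in centroids.items()}
        correct += min(dist, key=dist.get) == t
    return correct / len(y_val)

def ga_select(fitness, n_features, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Evolve boolean feature masks; returns the fittest mask of the final generation."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def tournament():
        # Binary tournament selection: the fitter of two random individuals wins.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: carry the current best forward unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_features)       # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]  # bit-flip mutation
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best

X, y = make_data()
X_train, y_train, X_val, y_val = X[:80], y[:80], X[80:], y[80:]

def fitness(mask):
    # Accuracy minus a small per-feature penalty (hypothetical) to favor compact subsets.
    return centroid_accuracy(X_train, y_train, X_val, y_val, mask) - 0.01 * sum(mask)

mask = ga_select(fitness, n_features=len(X[0]))
print("selected feature indices:", [j for j, keep in enumerate(mask) if keep])
```

In a realistic setting the fitness would more likely be a cross-validated score of the target classifier, and the final subset would be assessed on data that played no part in the search, since selecting features on the same split used for reporting can inflate the results.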