PREDICTING LUNG CANCER USING EXPLAINABLE ARTIFICIAL INTELLIGENCE AND BORUTA-SHAP METHODS
Author:
Akkur Erkan1ORCID, Öztürk Ahmet Cankat2ORCID
Affiliation:
1. Türkiye İlaç ve Tıbbi Cihaz Kurumu 2. Savunma Sanayii Başkanlığı
Abstract
Machine learning algorithms, a popular approach for disease prediction in recent years, can also be used to predict lung cancer, which has fatal effects. A prediction model based on machine learning algorithms is proposed to predict lung cancer. Five decision tree-based algorithms were preferred as classifiers. The experiment was conducted on a publicly available data set that contained risk factors. The Boruta-SHAP approach was employed to reveal the most salient features in the dataset. The use of the feature selection method improved the performance of the classifiers in the prediction process. Experiments were conducted using all features and reduced features separately. When comparing all the classifiers' performances, the XGBoost algorithm produced the best prediction rate with an accuracy of 97.22% and an AUROC of 0.972. The proposed model has a good classification rate compared to similar studies in the literature. We used the SHAP (SHapley Additive exPlanation) approach to investigate the effect of risk factors in the dataset on the model output. As a result, allergy was found to be the most significant risk factor for this disease.
Publisher
Kahramanmaras Sutcu Imam University Journal of Engineering Sciences
Reference33 articles.
1. Sung, H., Ferlay, J., Siegel, R. L., Laversanne, M., Soerjomataram, I., Jemal, A., & Bray, F. (2021). Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: a cancer journal for clinicians, 71(3), 209-249. 2. Li, C., Lei, S., Ding, L., Xu, Y., Wu, X., Wang, H., Zhang, Z., Gao, T., Zhang, Y., Li, L. (2023). Global burden and trends of lung cancer incidence and mortality. Chin Med J (Engl), 136(13):1583-1590 3. Latimer, K. M., & Mott, T. F. (2015). Lung cancer: diagnosis, treatment principles, and screening. American family physician, 91(4), 250-256. 4. Kaplanoglu, E., & Nasab, A. (2023). Evaluation of artificial intelligence techniques in disease diagnosis and prediction. Discover Artificial Intelligence, 3(1). 5. Turk, F. &. Kokver, Y. (2022). Application with deep learning models for COVID-19 diagnosis, SAUCIS, vol. 5, no. 2, pp. 169-180.
|
|