Affiliation:
1. Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
Abstract
Kidney stone disease is a widespread urological disorder affecting millions of people worldwide, and timely diagnosis is crucial to avoid severe complications. Renal stones are traditionally detected using computed tomography (CT), which, although effective, is costly and resource-intensive, exposes patients to unnecessary radiation, and often causes delays while radiology reports are awaited. This study presents a novel machine learning approach to detecting renal stones early from routine laboratory test results. We used a dataset of 2156 patient records from a Saudi Arabian hospital, with 15 attributes and challenges such as missing data and class imbalance. We evaluated various machine learning algorithms in combination with single and multiple imputation methods and with oversampling and undersampling techniques. Our results show that ensemble tree-based classifiers, specifically random forest (RF) and extra trees (ETree), outperform the other models, with accuracy of 99%, recall of 98%, and F1 scores of 99% for RF and 92% for ETree. The study underscores the potential of non-invasive, cost-effective laboratory tests for renal stone detection, supporting prompt and improved medical care.
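As a rough illustration of the preprocessing and evaluation steps named in the abstract, the following minimal sketch shows single (column-mean) imputation, random oversampling of the minority class, and the computation of accuracy, recall, and F1 from predictions. It is not the study's pipeline: the toy values, the function names `mean_impute`, `random_oversample`, and `binary_metrics`, and the choice of mean imputation and random duplication are all assumptions made for illustration, and no classifier (RF or ETree) is trained here.

```python
import random
from statistics import mean


def mean_impute(rows):
    """Single imputation: replace each missing value (None) with its column mean."""
    n_cols = len(rows[0])
    col_means = [
        mean(r[j] for r in rows if r[j] is not None) for j in range(n_cols)
    ]
    return [
        [col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
        for r in rows
    ]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for r, y in zip(rows, labels):
        by_class.setdefault(y, []).append(r)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for y, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for r in group + extra:
            out_rows.append(r)
            out_labels.append(y)
    return out_rows, out_labels


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, recall, and F1 for a binary task, computed from confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


# Toy data (hypothetical values, not from the study): two lab features, None = missing.
toy_rows = [[1.0, None], [3.0, 4.0], [None, 8.0], [5.0, 6.0]]
toy_labels = [0, 0, 0, 1]  # class 1 is the minority class

imputed = mean_impute(toy_rows)
balanced_rows, balanced_labels = random_oversample(imputed, toy_labels)

# Metrics for a hypothetical set of predictions.
scores = binary_metrics(y_true=[1, 1, 0, 0], y_pred=[1, 0, 0, 0])
print(imputed, balanced_labels, scores)
```

In practice, resampling would be applied only to the training split so that duplicated rows do not leak into the evaluation data, and recall and F1 would be reported alongside accuracy because accuracy alone can look high on an imbalanced dataset.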
Funder
Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, Saudi Arabia, under Grant No.