Affiliation:
1. University of Queensland
Abstract
Background
Many machine learning (ML) models predict acute kidney injury (AKI) in hospitalised patients. Although a primary goal of these models is to support clinical decision-making in hospitals, different methods of estimating baseline serum creatinine (sCr) can produce inconsistent ground truth when estimating AKI incidence. The real-world utility of such models is therefore often in question, given the high rate of false positive predictions, which can lead to negative clinical outcomes.
Objective
The first aim of this study was to develop and assess the performance of ML models using three different methods of estimating baseline sCr. The second aim was to conduct an error analysis to reduce the rate of false positives.
Materials and Methods
For both aims, Intensive Care Unit (ICU) patients from the Medical Information Mart for Intensive Care (MIMIC)-IV dataset were used. AKI episodes were identified with the KDIGO (Kidney Disease: Improving Global Outcomes) definition, using three different methods of estimating baseline sCr. ML models were developed for each cohort and their performance was compared. Explainability methods were used to analyse the errors of the XGBoost model.
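A minimal sketch of why the baseline window matters for labelling is given below. It is an illustration only, not the study's pipeline: the function names, the toy measurements and the ICU value are hypothetical, and only the relative KDIGO criterion (sCr at least 1.5 times baseline) is checked. The timing requirement, the absolute 0.3 mg/dL rise within 48 hours and the urine output criteria are deliberately left out.

```python
from datetime import datetime, timedelta
from statistics import mean


def baseline_scr(measurements, icu_admit, start_days, end_days):
    """Mean sCr (mg/dL) measured between start_days and end_days before ICU admission.

    measurements: list of (timestamp, sCr value) pairs. Returns None if the
    window contains no measurement.
    """
    window_start = icu_admit - timedelta(days=start_days)
    window_end = icu_admit - timedelta(days=end_days)
    values = [v for t, v in measurements if window_start <= t <= window_end]
    return mean(values) if values else None


def meets_kdigo_ratio(baseline, icu_scr):
    """Simplified check of the relative KDIGO criterion: sCr >= 1.5 x baseline."""
    return baseline is not None and icu_scr >= 1.5 * baseline


# Hypothetical patient: three pre-ICU measurements and one ICU value.
admit = datetime(2020, 1, 10)
history = [
    (admit - timedelta(days=100), 1.0),
    (admit - timedelta(days=30), 1.2),
    (admit - timedelta(days=2), 1.9),
]
icu_value = 1.8

baseline_180_7 = baseline_scr(history, admit, 180, 7)  # excludes the last week
baseline_180_0 = baseline_scr(history, admit, 180, 0)  # includes the last week

label_180_7 = meets_kdigo_ratio(baseline_180_7, icu_value)
label_180_0 = meets_kdigo_ratio(baseline_180_0, icu_value)
print(label_180_7, label_180_0)
```

In this toy case the same ICU value is labelled as AKI under the 180 to 7 day baseline but not under the 180 to 0 day baseline, because the elevated measurement two days before admission raises the second baseline.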
Results
The baseline defined as the mean sCr from 180 to 7 days before ICU admission yielded the highest performance metrics with the XGBoost model. With the explainability methods applied, the baseline defined as the mean sCr from 180 to 0 days pre-ICU led to a further reduction in the false positive (FP) rate, with the highest AUC of 0.86, recall of 0.61, precision of 0.56 and F1 score of 0.58. The cohort size was 31,586 admissions, of which 5,473 (17.32%) had AKI.
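As a quick consistency check, the F1 score is the harmonic mean of precision and recall, so it can be recomputed from the two reported values. The helper below is a hypothetical illustration, not code from the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported precision and recall for the best XGBoost configuration.
f1 = f1_score(0.56, 0.61)
print(round(f1, 2))  # 0.58
```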
Conclusion
To enable the effective use of AI in AKI prediction and management, a clinically relevant and widely applicable standard method for estimating baseline sCr is needed. In healthcare, explainability techniques can help AI developers and end users understand how AI models make predictions. We conclude that ML development with model-driven and data-driven architectures can be effective in minimising the occurrence of false positives. This can improve the success rate of ML implementation in routine care.
Publisher
Research Square Platform LLC