BACKGROUND
The healthcare system is shifting towards a more patient-centred approach for individuals with chronic and complex conditions, which presents a series of challenges, such as predicting hospital needs and optimizing resources. At the same time, the exponential increase in health data availability has made it possible to apply advanced statistics and artificial intelligence techniques to develop decision-support systems and improve the efficiency of resource planning, diagnosis and patient screening. These methods are key to automating the analysis of large volumes of medical data and reducing professionals' workload.
OBJECTIVE
This article presents a machine learning model, and a case study in a cohort of highly complex patients, to predict death within 4 years and early death within 6 months of the complexity diagnosis. The method uses easily accessible variables and healthcare resource utilization information.
METHODS
Six classification models are implemented and evaluated using a 70/30 train-test split and stratified cross-validation with k=10, and the best-performing model is selected. The evaluation metrics are accuracy, recall, precision, F1-score and area under the ROC curve (AUC-ROC).
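The evaluation protocol described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names `stratified_split` and `stratified_folds`, the synthetic labels and the 20% prevalence are all assumptions made for the example. In practice, libraries such as scikit-learn offer equivalent utilities (`train_test_split` with `stratify`, `StratifiedKFold`).

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx), preserving the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def stratified_folds(indices, labels, k=10, seed=0):
    """Yield (fit_idx, val_idx) pairs; each fold keeps roughly the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    folds = [[] for _ in range(k)]
    for idx in by_class.values():
        rng.shuffle(idx)
        # Deal the samples of each class round-robin across the k folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    for f in range(k):
        val = folds[f]
        fit = [i for g in range(k) if g != f for i in folds[g]]
        yield fit, val


# Hypothetical labels: 1 = died, 0 = survived (assumed 20% prevalence).
labels = [1] * 200 + [0] * 800

# 70/30 split, then 10 stratified folds over the training portion.
train_idx, test_idx = stratified_split(labels, test_fraction=0.3)
folds = list(stratified_folds(train_idx, labels, k=10))

for fit_idx, val_idx in folds:
    # A candidate model would be fitted on fit_idx and scored on val_idx here.
    pass
```

Stratifying both the split and the folds keeps the proportion of deaths similar in every subset, which matters when the outcome is relatively rare, as early death typically is.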
RESULTS
The best model for predicting patient death, the Extreme Gradient-Boosting classifier (XG-Boost), achieves 87% accuracy (recall=0.87, precision=0.82, F1=0.84, AUC=0.88). Results are worse for early death (within 6 months), where the best model, the Gradient Boosting Classifier (GR-Boost), achieves 83% accuracy (recall=0.55, precision=0.64, F1=0.57, AUC=0.88).
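To make the relationship between these metrics concrete, the sketch below computes accuracy, recall, precision and F1 from a confusion matrix. The counts are hypothetical, chosen only for illustration, and are not taken from the study; the function name `classification_metrics` is likewise an assumption. It shows how accuracy can remain high while recall is low when the positive class (here, death) is rare.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)        # share of actual deaths that were detected
    precision = tp / (tp + fp)     # share of predicted deaths that were correct
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


# Hypothetical, imbalanced example: 100 deaths among 500 patients.
metrics = classification_metrics(tp=50, fp=30, fn=50, tn=370)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this made-up example the classifier misses half of the deaths, yet accuracy is still 0.84 because the many correctly classified survivors dominate the total. This is why recall and F1 are reported alongside accuracy.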
CONCLUSIONS
This study shows encouraging results in predicting mortality among patients with chronic and complex conditions. The variables used are easily accessible, and the patient's healthcare resource utilization information, which has not been used by current state-of-the-art approaches, shows promising predictive power. The proposed prediction model is designed to efficiently identify cases that need customized care and to help healthcare providers anticipate the demand for critical resources.