Abstract
BACKGROUND
Predicting the risk of stroke recurrence for individual patients is difficult. Individualised prediction may enhance stroke survivors' engagement in self-care. We have developed PRERISK: a statistical and Machine Learning (ML) classifier to predict individual stroke recurrence risk.

METHODS
We analysed clinical and socioeconomic data from a prospectively collected public healthcare-based dataset of 44623 patients admitted with a stroke diagnosis to 88 public hospitals over 6 years in Catalonia, Spain. We trained several supervised ML models to provide individualised risk over time and compared them with a Cox regression model.

RESULTS
Overall, 16% of patients presented a stroke recurrence over a median follow-up of 2.65 years. Models were trained to predict early, late and long-term recurrence risk, within 90, 91-365 and >365 days, respectively. The most powerful predictors of stroke recurrence were time since index stroke, Barthel index, atrial fibrillation, dyslipidemia, haemoglobin and body mass index, which were used to create a simplified model with similar performance. The balanced AUROCs were 0.77 (±0.01), 0.61 (±0.01) and 0.71 (±0.01) for early, late and long-term recurrence risk, respectively (Cox risk class probability: 0.74 (±0.01), 0.59 (±0.01) and 0.68 (±0.01); c-index 0.88). Overall, the ML approach showed a statistically significant improvement over the Cox model. Stroke recurrence curves can be simulated for each patient under different degrees of control of modifiable factors.

CONCLUSION
PRERISK represents a novel approach that provides continuous, personalised and fairly accurate prediction of stroke recurrence risk over time according to the degree of control of modifiable risk factors.

CLINICAL PERSPECTIVE
What is new?
Stroke recurrence is frequent despite advances in stroke treatment, and the risk for an individual patient is difficult to predict.
We have created PRERISK, a predictive model based on machine learning (ML) that provides individualised information on the probability of stroke recurrence and can be recalculated according to the control of risk factors.

What are the clinical implications?
PRERISK information can be used as feedback for secondary prevention strategies and may enhance patient engagement and treatment compliance.
The approach could be scaled to optimise ML-based prevention strategies in other chronic conditions.
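
The three prediction horizons named in the results (early, late and long-term) can be illustrated with a minimal labelling sketch. The cut-offs of 90 and 365 days come from the abstract; the function name `recurrence_window`, its argument and the handling of negative values are assumptions made for illustration and are not taken from the PRERISK implementation.

```python
def recurrence_window(days_since_index_stroke: int) -> str:
    """Map days from index stroke to recurrence onto the three horizons.

    Hypothetical helper: the thresholds follow the abstract
    (within 90, 91-365 and >365 days); everything else is assumed.
    """
    if days_since_index_stroke < 0:
        raise ValueError("days_since_index_stroke must be non-negative")
    if days_since_index_stroke <= 90:
        return "early"
    if days_since_index_stroke <= 365:
        return "late"
    return "long-term"


if __name__ == "__main__":
    # Print the window assigned to a few example recurrence times.
    for days in (30, 90, 91, 365, 366):
        print(days, recurrence_window(days))
```

A label of this kind would only define the target classes; how the authors built the training sets for each horizon is not described in the abstract.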
Publisher
Cold Spring Harbor Laboratory