Abstract
Background
This study assessed the predictive accuracy of advanced AI language models and established clinical grading scales in prognosticating outcomes for patients with aneurysmal subarachnoid hemorrhage (aSAH).
Methods
This retrospective cohort study included 82 patients with aSAH. We evaluated the predictive performance of AtlasGPT and ChatGPT 4.0 using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and Youden's Index, and compared them with established clinical grading scales: the World Federation of Neurological Surgeons (WFNS) scale, the Subarachnoid Hemorrhage Early Brain Edema Score (SEBES), and the Fisher scale. The assessment focused on four endpoints: in-hospital mortality, need for decompressive hemicraniectomy, functional outcome at discharge, and functional outcome at 6-month follow-up.
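The metrics named above can be illustrated with a minimal sketch. It is not the study's analysis code: the function names are hypothetical, the scores and outcome labels are made-up toy values, and it assumes a higher score (for example a higher WFNS grade) indicates a higher risk of the event. AUC is computed here as the Mann-Whitney probability that a randomly chosen patient with the event scores higher than one without it.

```python
from typing import List, Tuple


def auc_mann_whitney(scores: List[float], labels: List[int]) -> float:
    """AUC as the fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(scores: List[float], labels: List[int], threshold: float) -> Tuple[float, float]:
    """Sensitivity and specificity when 'score >= threshold' predicts the event."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def best_youden(scores: List[float], labels: List[int]) -> Tuple[float, float, float, float]:
    """Return (threshold, J, sensitivity, specificity) maximizing J = sens + spec - 1."""
    best = None
    for t in sorted(set(scores)):
        sens, spec = sens_spec(scores, labels, t)
        j = sens + spec - 1.0
        if best is None or j > best[1]:
            best = (t, j, sens, spec)
    return best


# Toy data (hypothetical): grades 1-5 and a binary outcome, 1 = event occurred.
toy_scores = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
toy_labels = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

print("AUC:", auc_mann_whitney(toy_scores, toy_labels))
print("Best cut-off (threshold, J, sens, spec):", best_youden(toy_scores, toy_labels))
```

In practice, confidence intervals for the AUC, such as the 95% CI reported in the Results, require an additional step (for example bootstrapping or DeLong's method) that this sketch does not cover.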
Results
In-hospital mortality occurred in 22% of the cohort, and 34.1% required decompressive hemicraniectomy during treatment. At hospital discharge, 28% of patients had a favorable outcome (modified Rankin Scale [mRS] score ≤ 2), a proportion that rose to 46.9% at the 6-month follow-up. Prognostication of 30-day in-hospital survival using the WFNS grading scale yielded an AUC of 0.72, with 59.4% sensitivity and 83.3% specificity. AtlasGPT provided the highest predictive accuracy for the need for decompressive hemicraniectomy (AUC 0.80, 95% CI: 0.70–0.91), with 82.1% sensitivity and 77.8% specificity. For discharge outcomes, the WFNS score and AtlasGPT showed comparable prognostic value, with AUCs of 0.74 and 0.75, respectively. Long-term functional outcome was best predicted by the WFNS scale, with an AUC of 0.76.
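As a worked illustration, Youden's Index can be derived from the sensitivity and specificity pairs quoted above. The values below are plain arithmetic on those figures, not results reported by the study, and they assume each pair refers to a single cut-off. The block uses the `align*` environment from `amsmath`.

```latex
\begin{align*}
J &= \text{sensitivity} + \text{specificity} - 1 \\
J_{\text{WFNS, 30-day survival}} &= 0.594 + 0.833 - 1 = 0.427 \\
J_{\text{AtlasGPT, hemicraniectomy}} &= 0.821 + 0.778 - 1 = 0.599
\end{align*}
```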
Conclusions
The study demonstrates the potential of integrating AI models such as AtlasGPT with clinical scales to enhance outcome prediction in aSAH patients. While established scales like WFNS remain reliable, AI language models show promise, particularly in predicting the necessity for surgical intervention and short-term functional outcomes.