BACKGROUND
Suicide is a pressing public health concern, and machine learning (ML) models can help identify individuals at risk. Studies using benchmark datasets and real-world social media data suggest that transfer learning from pre-trained language models (LMs) is a promising approach to predicting suicidal ideation and behaviors from speech and text.
OBJECTIVE
We set out to (i) develop and apply ML methods for predicting suicidal ideation and behaviors in a real-world crisis helpline dataset, using transformer-based pre-trained models as a building block; (ii) evaluate, cross-validate, and benchmark the model against traditional text classification approaches; and (iii) train an explainer model that identifies relevant risk-associated features.
METHODS
We used chat protocols from young people aged 14 to 25 years who sought help from a German crisis helpline to train an ML model that combines a transformer-based language model architecture with pre-trained weights and long short-term memory (LSTM) layers. We predicted suicidal ideation (SI) and advanced suicidal engagement (ASE), as indicated by composite Columbia-Suicide Severity Rating Scale (C-SSRS) scores, and compared the predictions against those of a classical word-vector-based ML model. We then assessed discrimination, calibration, and clinical utility, and obtained explainability information using a Shapley value-based post hoc estimation (SHAP) model.
RESULTS
Based on data from 1,348 help-seeking encounters, the transformer-based classifier yielded a macro-averaged area under the curve (AUC) of 0.93 (95% confidence interval [CI] [0.87, 0.99]) and a macro-averaged F1 score of 0.79 (95% CI [0.60, 0.96]). It outperformed the word-vector-based baseline model (AUC = 0.77, 95% CI [0.63, 0.89]; F1 score = 0.56, 95% CI [0.0, 0.65]). The SHAP model highlighted language features such as "I-talk," phrases indicating low self-esteem and self-hatred, lethal means, hopelessness, and body issues as predictive of suicidal ideation and behaviors.
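For readers less familiar with the reported metrics, the following is a minimal sketch of how a macro-averaged F1 score and a 95% CI around it can be computed. It assumes a percentile bootstrap over encounters, which may differ from the procedure used in the study. The class labels, toy data, and function names are hypothetical and serve illustration only.

```python
import random


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # Convention chosen here: a class that is absent from both the
        # true and predicted labels contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def bootstrap_ci(y_true, y_pred, labels, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for macro F1 (assumed procedure)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        # Resample encounters with replacement, keeping label pairs intact.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(
            macro_f1([y_true[i] for i in idx], [y_pred[i] for i in idx], labels)
        )
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Hypothetical three-class toy data (not taken from the study).
LABELS = ["none", "SI", "ASE"]
y_true = ["none", "none", "none", "SI", "SI", "SI", "ASE", "ASE"]
y_pred = ["none", "none", "SI", "SI", "SI", "none", "ASE", "SI"]

point = macro_f1(y_true, y_pred, LABELS)
low, high = bootstrap_ci(y_true, y_pred, LABELS)
print(f"macro F1 = {point:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

Because macro-averaging weights every class equally, a rare class with few positive cases can strongly affect both the point estimate and the width of the interval, which is one plausible reason for wide CIs in small, imbalanced samples.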
CONCLUSIONS
Neural networks that use LM-based transfer learning can effectively identify suicidal ideation and advanced suicidal engagement. The explainer model additionally revealed language features associated with these phenomena. Such models may support clinical decision-making in suicide prevention services.