Abstract
Objective
Preventing suicide in US youth is of paramount concern, with rates increasing over 50% between 2007 and 2018. Statistical modeling using electronic health records may help identify at-risk youth before a suicide attempt. Although electronic health records contain diagnoses, which are known risk factors, they generally omit or poorly document social determinants (e.g., social support), which are also known risk factors. Models that incorporate social determinants measures alongside diagnostic records may therefore identify additional at-risk youth before a suicide attempt.
Methods
Suicide attempts were predicted in hospitalized patients, ages 10–24, from the State of Connecticut’s Hospital Inpatient Discharge Database (HIDD; N = 38,943). Predictors included demographic information, diagnosis codes, and, through a data fusion framework, social determinants features transferred (fused) from an external source of survey data, the National Longitudinal Study of Adolescent to Adult Health (Add Health). Social determinant information for each HIDD patient was generated by averaging values from the most similar Add Health individuals (e.g., the top 10), with similarity computed on features shared between the datasets (e.g., using Pearson’s r). Attempts were then modelled using an elastic net logistic regression with both HIDD features and fused Add Health features.
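The fusion step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fuse_features`, its arguments, and the choice to compute Pearson's r across each individual's shared-feature profile are assumptions made for illustration only.

```python
import numpy as np


def fuse_features(target_shared, source_shared, source_extra, k=10):
    """Hypothetical helper: estimate survey-only features for clinical records.

    target_shared: (n_target, n_shared) features present in both datasets,
                   for the clinical records (e.g., HIDD patients).
    source_shared: (n_source, n_shared) the same features for survey respondents.
    source_extra:  (n_source, n_extra) survey-only features to transfer.
    Returns an (n_target, n_extra) array of fused feature estimates.
    """
    def row_standardize(x):
        # Center each row and scale it to unit length, so that the dot
        # product of two rows equals their Pearson correlation.
        x = x - x.mean(axis=1, keepdims=True)
        norm = np.linalg.norm(x, axis=1, keepdims=True)
        norm[norm == 0] = 1.0  # constant rows end up with correlation 0
        return x / norm

    t = row_standardize(np.asarray(target_shared, dtype=float))
    s = row_standardize(np.asarray(source_shared, dtype=float))
    r = t @ s.T  # (n_target, n_source) matrix of Pearson's r

    # Indices of the k most similar survey respondents for each record.
    top = np.argsort(-r, axis=1)[:, :k]

    # Average the survey-only features over those k neighbours.
    return np.asarray(source_extra, dtype=float)[top].mean(axis=1)
```

Under these assumptions, the fused columns would be concatenated with the clinical features before fitting the elastic net logistic regression (in scikit-learn, for example, `LogisticRegression` with the `saga` solver supports this penalty).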
Results
The model including fused social determinants outperformed the conventional model (AUC = 0.83 v. 0.82). Sensitivity and positive predictive value at 90% and 95% specificity were almost 10% higher when including fused features (e.g., sensitivity at 90% specificity = 0.48 v. 0.44). Among social determinants variables, the perception that one's mother cares and being non-religious appeared particularly important to performance improvement.
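For readers unfamiliar with metrics reported at a fixed specificity, the following sketch shows one common way to compute them. The function name `sensitivity_at_specificity` and the thresholding rule are assumptions for illustration; the study's exact procedure is not given here.

```python
import math

import numpy as np


def sensitivity_at_specificity(y_true, scores, specificity=0.90):
    """Hypothetical helper: sensitivity and PPV at a target specificity.

    The threshold is the smallest observed non-case score such that at
    least `specificity` of non-cases score at or below it; records scoring
    strictly above the threshold are flagged as predicted attempts.
    """
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)

    negatives = np.sort(scores[~y_true])
    n = negatives.size
    # Index of the order statistic that keeps >= specificity of non-cases
    # at or below the threshold.
    idx = min(max(math.ceil(specificity * n) - 1, 0), n - 1)
    threshold = negatives[idx]

    flagged = scores > threshold
    sensitivity = flagged[y_true].mean()  # true positives / all cases
    # PPV is undefined when nothing is flagged.
    ppv = y_true[flagged].mean() if flagged.any() else float("nan")
    return sensitivity, ppv
```

Applied to the predicted probabilities of two competing models on the same held-out data, such a function allows a like-for-like comparison at the same false-positive budget.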
Discussion
This proof-of-concept study showed that incorporating, through a data fusion framework, social determinants measures from an external survey database could improve prediction of youth suicide risk from clinical data. While social determinant data collected directly from patients might be ideal, estimating these characteristics via data fusion avoids the task of data collection, which is generally time-consuming and expensive and often suffers from non-compliance.
Funder
National Institute of Mental Health
Publisher
Public Library of Science (PLoS)