Accuracy and generalizability of machine learning models for adolescent suicide prediction with longitudinal clinical records

Author:

Zang Chengxi (1), Hou Yu (1), Jin Jun, Sacco Shane, Chen Kun (2), Aseltine Robert, Wang Fei (1)

Affiliation:

1. Weill Cornell Medicine

2. University of Connecticut

Abstract

Machine learning (ML) models trained on real-world data (RWD) have demonstrated promise in predicting suicide attempts in adolescents. However, their cross-data performance and transportability for suicide prediction are largely unknown, which hinders their clinical adoption. We developed ML suicide prediction models based on RWD collected in different contexts (inpatient, outpatient, etc.) for varying purposes (e.g., administrative claims and electronic health records), compared their cross-data performance, and evaluated their transportability. The data came from the All-Payer Claims Database (APCD) and the Hospital Inpatient Discharge Database (HIDD) in Connecticut, as well as electronic health records (EHR) data provided by the Kansas Health Information Network (KHIN). Across the three datasets, we included 285,320 patients, among whom we identified 3,389 (1.2%) suicide attempters; 66.0% of the suicide attempters were female. Three ML models (regularized logistic regression, gradient boosting machine, and long short-term memory neural network) were evaluated on the local datasets and on the transported datasets. All three models showed significant and comparable decreases in transfer performance relative to local performance: average AUC fell by up to 7.7%; at the 90% specificity level, average sensitivity fell by up to 16% and PPV by up to 2%; and at the 95% specificity level, sensitivity fell by up to 20% and PPV by up to 5%. The similarity of behavior across these modeling approaches strengthens the validity of our results. We also compared the commonality and heterogeneity of the predictors learned across populations.
These results indicate that no matter how well ML suicide models performed on their source data, their performance is limited when they are transported to new datasets. However, the transported models did identify additional new cases. Our analyses could facilitate the development of suicide prediction models with better performance and generalizability.
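
The abstract reports sensitivity and PPV at fixed specificity levels (90% and 95%). As a minimal sketch of how such operating-point metrics can be computed from predicted risk scores, the Python function below picks the lowest score threshold whose specificity meets the target and reports sensitivity and PPV there. The thresholding rule, the function name `sensitivity_ppv_at_specificity`, and the toy labels and scores are all assumptions made for illustration; the abstract does not state the authors' exact procedure, and none of the numbers below come from the study.

```python
from typing import Sequence, Tuple


def sensitivity_ppv_at_specificity(
    y_true: Sequence[int],
    y_score: Sequence[float],
    target_specificity: float,
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, ppv) at the lowest observed-score
    threshold whose specificity is at least `target_specificity`.

    A patient is flagged as high risk when score > threshold.
    (Hypothetical helper; not the authors' implementation.)
    """
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    if not negatives or not positives:
        raise ValueError("need at least one positive and one negative label")

    # Try observed scores from lowest to highest; stop at the first one
    # that leaves a large enough share of negatives unflagged.
    threshold = max(y_score)
    for candidate in sorted(set(y_score)):
        true_neg = sum(1 for s in negatives if s <= candidate)
        if true_neg / len(negatives) >= target_specificity:
            threshold = candidate
            break

    true_neg = sum(1 for s in negatives if s <= threshold)
    true_pos = sum(1 for s in positives if s > threshold)
    false_pos = len(negatives) - true_neg
    sensitivity = true_pos / len(positives)
    flagged = true_pos + false_pos
    ppv = true_pos / flagged if flagged else float("nan")
    return threshold, sensitivity, ppv


# Toy data (invented): 10 non-attempters followed by 4 attempters.
toy_labels = [0] * 10 + [1] * 4
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95,
              0.85, 0.92, 0.3, 0.99]

for spec in (0.90, 0.95):
    thr, sens, ppv = sensitivity_ppv_at_specificity(toy_labels, toy_scores, spec)
    print(f"specificity>={spec:.2f}: threshold={thr}, "
          f"sensitivity={sens:.2f}, PPV={ppv:.2f}")
```

Running the same function on a model's scores for its source dataset and for a transported dataset, and taking the difference, would give a local-versus-transfer comparison of the kind the abstract describes.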

Publisher

Research Square Platform LLC
