Scalable Incident Detection via Natural Language Processing and Probabilistic Language Models

Published: 2023-12-01

Author:

Walsh Colin G., Wilimitis Drew, Chen Qingxia, Wright Aileen, Kolli Jhansi, Robinson Katelyn, Ripperger Michael A., Johnson Kevin B., Carrell David, Desai Rishi J., Mosholder Andrew, Dharmarajan Sai, Adimadhyam Sruthi, Fabbri Daniel, Stojanovic Danijela, Matheny Michael E., Bejan Cosmin A.

Abstract

Postmarketing safety surveillance depends in part on the ability to detect concerning clinical events at scale. Spontaneous reporting might be an effective component of safety surveillance, but it requires awareness and understanding among healthcare professionals to achieve its potential. Reliance on readily available structured data such as diagnostic codes risks under-coding and imprecision. Clinical textual data might bridge these gaps, and natural language processing (NLP) has been shown to aid in scalable phenotyping across healthcare records in multiple clinical domains. In this study, we developed and validated a novel incident phenotyping approach using unstructured clinical textual data, agnostic to Electronic Health Record (EHR) and note type. It is based on a published, validated approach (PheRe) used to ascertain social determinants of health and suicidality across entire healthcare records. To demonstrate generalizability, we validated this approach on two separate phenotypes that share common challenges with respect to accurate ascertainment: 1) suicide attempt; 2) sleep-related behaviors. With samples of 89,428 records and 35,863 records for suicide attempt and sleep-related behaviors, respectively, we conducted silver standard (diagnostic coding) and gold standard (manual chart review) validation. We showed Area Under the Precision-Recall Curve (AUPR) of ~0.77 (95% CI 0.75-0.78) for suicide attempt and AUPR of ~0.31 (95% CI 0.28-0.34) for sleep-related behaviors. We also evaluated performance by coded race and demonstrated that differences in performance by race were dissimilar across phenotypes and require algorithmovigilance and debiasing prior to implementation.
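For readers unfamiliar with the headline metric, the following is a minimal sketch of how an AUPR value and a 95% confidence interval of the kind reported in the abstract could be estimated. It is not the authors' code: the abstract does not state how AUPR or its interval were computed. The sketch assumes average precision as the AUPR estimator and a percentile bootstrap for the interval, and the function names `average_precision` and `bootstrap_ci` are hypothetical.

```python
import random


def average_precision(labels, scores):
    """Average precision, one common estimator of the area under the PR curve.

    labels: list of 0/1 ground-truth values (e.g. chart-review results).
    scores: list of model scores, higher meaning more likely positive.
    Tied scores are kept in input order (Python's sort is stable).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at the rank of this positive
    return total / n_pos


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for average precision (an assumption,
    not necessarily the interval method used in the study)."""
    if sum(labels) == 0:
        raise ValueError("need at least one positive label")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        if sum(sample_labels) == 0:
            continue  # resample contained no positives; draw again
        sample_scores = [scores[i] for i in idx]
        stats.append(average_precision(sample_labels, sample_scores))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Toy data only; not drawn from the study.
    toy_labels = [1, 0, 1, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.1]
    print(average_precision(toy_labels, toy_scores))  # (1/1 + 2/3) / 2 = 5/6
    print(bootstrap_ci(toy_labels, toy_scores, n_boot=200))
```

Unlike the area under the ROC curve, the chance-level AUPR equals the prevalence of the positive class, so values for phenotypes with different prevalence are not directly comparable.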

Publisher

Cold Spring Harbor Laboratory

References: 43 articles.

1. The FDA's sentinel initiative-A comprehensive approach to medical product surveillance

2. Developing the Sentinel System — A National Resource for Evidence Development

3. The US Food and Drug Administration's Sentinel Initiative: Expanding the horizons of medical product safety

4. The FDA Sentinel Initiative — An Evolving National Resource

5. Using Electronic Health Records to Identify Adverse Drug Events in Ambulatory Care: A Systematic Review; Appl. Clin. Inform.; 2019

Cited by 1 article.

1. Enhancing Postmarketing Surveillance of Medical Products With Large Language Models; JAMA Network Open; 2024-08-16