Affiliation:
1. Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, USA
2. Harvard Medical School, Boston, Massachusetts, USA
3. Clinical Data Animation Center, Massachusetts General Hospital, Boston, Massachusetts, USA
4. Henry and Allison McCance Center for Brain Health, Massachusetts General Hospital, Boston, Massachusetts, USA
Abstract
Objective: Unstructured data present in electronic health records (EHR) are a rich source of medical information; however, their abstraction is labor intensive. Automated EHR phenotyping (AEP) can reduce the need for manual chart review. We present an AEP model that is designed to automatically identify patients diagnosed with epilepsy.
Methods: The ground truth for model training and evaluation was captured from a combination of structured questionnaires filled out by physicians for a subset of patients and manual chart review using customized software. Modeling features included indicators of the presence of keywords and phrases in unstructured clinical notes, prescriptions for antiseizure medications (ASMs), International Classification of Diseases (ICD) codes for seizures and epilepsy, number of ASMs and epilepsy‐related ICD codes, age, and sex. Data were randomly divided into training (70%) and hold‐out testing (30%) sets, with distinct patients in each set. We trained regularized logistic regression and extreme gradient boosting models. Model performance was measured using area under the receiver operating characteristic curve (AUROC) and area under the precision–recall curve (AUPRC), with 95% confidence intervals (CI) estimated via bootstrapping.
Results: Our study cohort included 3903 adults drawn from outpatient departments of nine hospitals between February 2015 and June 2022 (mean age = 47 ± 18 years, 57% women, 82% White, 84% non‐Hispanic, 70% with epilepsy). The final models included 285 features, including 246 keywords and phrases captured from 8415 encounters. Both models achieved AUROC and AUPRC of 1 (95% CI = .99–1.00) in the hold‐out testing set.
Significance: A machine learning‐based AEP approach accurately identifies patients with epilepsy from notes, ICD codes, and ASMs. This model can enable large‐scale epilepsy research using EHR databases.
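The evaluation protocol described in the Methods (a 70/30 split with distinct patients in each set, and AUROC with a bootstrapped 95% CI) can be illustrated with a minimal sketch. This is not the authors' code: the function names (`split_by_patient`, `auroc`, `bootstrap_ci`), the percentile bootstrap, the number of resamples, and the toy data are all assumptions made for illustration, and only the Python standard library is used.

```python
import random


def split_by_patient(patient_ids, test_frac=0.3, seed=0):
    """Assign whole patients to train or test so no patient appears in both.

    Hypothetical helper: the abstract only states that the split was
    random, 70/30, with distinct patients in each set.
    """
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_test = round(test_frac * len(unique_ids))
    test_ids = set(unique_ids[:n_test])
    train_rows = [i for i, pid in enumerate(patient_ids) if pid not in test_ids]
    test_rows = [i for i, pid in enumerate(patient_ids) if pid in test_ids]
    return train_rows, test_rows


def auroc(labels, scores):
    """AUROC as P(score of a random positive > score of a random negative).

    Ties count as 0.5. Quadratic in the number of samples, which is
    acceptable for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, metric=auroc, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate plus a percentile-bootstrap (1 - alpha) interval.

    Assumption: a simple percentile bootstrap over test-set rows; the
    abstract does not say which bootstrap variant was used.
    """
    point = metric(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that lost one of the two classes
        stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper


# Toy usage with made-up data (not from the study).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_point, toy_lower, toy_upper = bootstrap_ci(toy_labels, toy_scores, n_boot=200)
```

Splitting on patient identifiers rather than on individual encounters is what keeps notes from the same patient out of both sets; the same `bootstrap_ci` helper could take an AUPRC function as `metric` under the same assumptions.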
Funder
Centers for Disease Control and Prevention
NIH Clinical Center
Subject
Neurology (clinical), Neurology
Cited by
7 articles.