Author:
Walker Andrew L., Watson Cheri, Butcher Ryan, Abedin Zameer, Yandell Mark, Shah Rashmee U.
Abstract
Background: Real-world evidence derived from the electronic medical record (EMR) is increasingly prevalent. How best to ascertain cardiovascular outcomes from EMRs is unknown. We sought to validate commercially available natural language processing (NLP) software for extracting bleeding events.

Methods: We included patients with atrial fibrillation and cancer seen at our cancer center from 1/1/2016 to 12/31/2019. A query set based on SNOMED CT expressions was created to represent bleeding from 11 different organ systems. We ran the query against the clinical notes and randomly selected a sample of notes for physician validation. The primary outcome was the positive predictive value (PPV) of the software for identifying bleeding events, stratified by organ system.

Results: We included 1370 patients with a mean age of 72 years (SD 1.5); 35% were female. We processed 66,130 notes; the NLP software identified 6522 notes, covering 654 unique patients, with possible bleeding events. Among 1269 randomly selected notes, the PPV of the software ranged from 0.921 for neurologic bleeds to 0.571 for OB/GYN bleeds. Patterns related to false positive bleeding events identified by the software included historic bleeds, hypothetical bleeds, missed negatives, and word errors.

Conclusions: NLP may provide an alternative for population-level screening for bleeding outcomes in cardiovascular studies. Human validation is still needed, but an NLP-driven screening approach may improve efficiency.
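As an illustration of the primary outcome, PPV within each organ system is the number of NLP-flagged notes that a physician confirmed as true bleeds divided by the number of flagged notes reviewed. The following is a minimal sketch, not the authors' code: the function name `ppv_by_group`, the input layout, and the sample labels are all hypothetical and are not study data.

```python
from collections import defaultdict


def ppv_by_group(reviewed_notes):
    """Compute PPV per organ system.

    reviewed_notes: iterable of (organ_system, is_true_bleed) pairs, one per
    NLP-flagged note that a physician reviewed (hypothetical layout).
    """
    # organ_system -> [confirmed true positives, total flagged notes reviewed]
    counts = defaultdict(lambda: [0, 0])
    for organ_system, is_true_bleed in reviewed_notes:
        counts[organ_system][1] += 1
        if is_true_bleed:
            counts[organ_system][0] += 1
    # PPV = true positives / (true positives + false positives)
    return {system: tp / total for system, (tp, total) in counts.items()}


# Made-up labels for illustration only; these are not values from the study.
sample = [
    ("neurologic", True),
    ("neurologic", True),
    ("neurologic", False),
    ("gastrointestinal", True),
    ("gastrointestinal", False),
]
ppv = ppv_by_group(sample)
```

Because every reviewed note was flagged by the software, the denominator contains only true and false positives; sensitivity cannot be estimated from such a sample.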
Publisher
Cold Spring Harbor Laboratory