Affiliation:
1. Flatiron Health Inc., 233 Spring St., New York, NY 10013, USA
2. Department of Medicine, NYU Grossman School of Medicine, New York, NY 10016, USA
3. Huntsman Cancer Institute, University of Utah, Salt Lake City, UT 84112, USA
Abstract
Our goal was to develop and characterize a natural language processing (NLP) algorithm to extract Eastern Cooperative Oncology Group Performance Status (ECOG PS) from unstructured electronic health record (EHR) sources to enhance observational datasets. By scanning unstructured EHR-derived documents from a real-world database, the NLP algorithm assigned ECOG PS scores to patients diagnosed with one of 21 cancer types who lacked structured ECOG PS numerical scores, anchored to the initiation of treatment lines. Manually abstracted ECOG PS scores were used as a source of truth to both develop the algorithm and evaluate its accuracy, sensitivity, and positive predictive value (PPV). Algorithm performance was further characterized by investigating the prognostic value of composite ECOG PS scores in patients with advanced non-small cell lung cancer (aNSCLC) receiving first-line (1L) treatment. Of N = 480,825 patient-lines, structured ECOG PS scores were available for 290,343 (60.4%). After applying NLP extraction, availability increased to 73.2%. Across all cancer types, the algorithm’s overall accuracy, sensitivity, and PPV were 93% (95% CI: 92–94%), 88% (95% CI: 87–89%), and 88% (95% CI: 87–89%), respectively. In a cohort of N = 51,948 aNSCLC patients receiving 1L therapy, the algorithm improved ECOG PS completeness from 61.5% to 75.6%. Stratification by ECOG PS showed worse real-world overall survival (rwOS) for patients with worse ECOG PS scores. We developed an NLP algorithm to extract ECOG PS scores from unstructured EHR documents with high accuracy, improving data completeness for EHR-derived oncology cohorts.
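As an illustration of the evaluation described above, the following minimal sketch computes accuracy, sensitivity, and PPV with 95% confidence intervals from paired abstracted and extracted scores. The abstract does not state how these metrics were defined for a multi-valued score, so the definitions here are assumptions: accuracy counts exact agreement (including agreement that no score is documented), sensitivity is computed over cases where abstraction found a score, and PPV over cases where the algorithm returned one. The Wilson interval, the function names, and the toy data are likewise hypothetical and not taken from the study.

```python
from math import sqrt


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    if total == 0:
        return (0.0, 0.0)
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return (centre - half, centre + half)


def proportion(successes, total):
    """Point estimate plus interval; returns None when the denominator is empty."""
    if total == 0:
        return None
    return (successes / total, wilson_ci(successes, total))


def evaluate(pairs):
    """pairs: list of (abstracted, extracted); None means no score was found."""
    exact = sum(1 for truth, pred in pairs if truth == pred)
    # Assumed definition: sensitivity over cases with an abstracted score.
    with_truth = [(t, p) for t, p in pairs if t is not None]
    # Assumed definition: PPV over cases where the algorithm returned a score.
    with_pred = [(t, p) for t, p in pairs if p is not None]
    return {
        "accuracy": proportion(exact, len(pairs)),
        "sensitivity": proportion(sum(t == p for t, p in with_truth), len(with_truth)),
        "ppv": proportion(sum(t == p for t, p in with_pred), len(with_pred)),
    }


def completeness_pct(n_with_score, n_total):
    """Share of patient-lines with an ECOG PS score, as a percentage."""
    return round(100 * n_with_score / n_total, 1)


# Toy data, invented for illustration only: (abstracted score, extracted score).
toy_pairs = [(0, 0), (1, 1), (2, 1), (None, None), (1, None), (None, 2), (3, 3), (0, 0)]
results = evaluate(toy_pairs)

# Structured-score availability, using the counts reported in the abstract.
structured_pct = completeness_pct(290_343, 480_825)
```

The counts passed to `completeness_pct` are the ones reported in the abstract for structured ECOG PS availability; everything else in the example is synthetic.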
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
8 articles.