Natural Language Processing vs Diagnosis Code-Based Methods for Postherpetic Neuralgia Identification: Development and Validation in Real-World Data (Preprint)

Author:

Zheng Chengyi, Ackerson Bradley, Qiu Sijia, Sy Lina S, Daily Leticia I. Vega, Song Jeannie, Qian Lei, Luo Yi, Ku Jennifer H., Cheng Yanjun, Wu Jun, Tseng Hung Fu

Abstract

BACKGROUND

Diagnosis codes and prescription data are used in algorithms to identify postherpetic neuralgia (PHN), a debilitating complication of herpes zoster (HZ). Because the accuracy of codes and prescription data is questionable, manual chart review is sometimes used to identify PHN in electronic health records (EHRs), but chart review is costly and time-consuming.

OBJECTIVE

To develop and validate a natural language processing (NLP) algorithm for automatically identifying PHN from unstructured EHR data, and to compare its performance with that of code-based methods.

METHODS

This retrospective study used EHR data from Kaiser Permanente Southern California, a large integrated healthcare system that serves over 4.8 million members. The source population included members aged ≥50 years who received an incident HZ diagnosis and an accompanying antiviral prescription between 2018 and 2020 and had ≥1 encounter within 90-180 days of the incident HZ diagnosis. The study team manually reviewed the EHR and identified PHN cases. For NLP development and validation, random samples of 500 and 800 members, respectively, were selected from the source population. The sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F-score, and Matthews correlation coefficient (MCC) of the NLP algorithm and the code-based methods were evaluated using chart-reviewed results as the reference standard.
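The evaluation measures named above can all be derived from a 2x2 table of algorithm output against the chart-review reference standard. The following is a minimal sketch, not the authors' code: the function name `diagnostic_metrics`, the `beta` parameter, and the counts in the example are hypothetical and are not taken from the study. The abstract does not state which F-score weighting was used, so `beta=1.0` is only an assumed default.

```python
from math import sqrt


def diagnostic_metrics(tp, fp, fn, tn, beta=1.0):
    """Compute accuracy measures from 2x2 counts (hypothetical helper).

    tp, fp, fn, tn: true positives, false positives, false negatives,
    and true negatives relative to the reference standard.
    beta: F-score weighting; beta=1.0 is an assumption, not from the paper.
    """
    sensitivity = tp / (tp + fn)   # share of true cases that were flagged
    specificity = tn / (tn + fp)   # share of non-cases that were not flagged
    ppv = tp / (tp + fp)           # share of flagged records that are true cases
    npv = tn / (tn + fn)           # share of unflagged records that are non-cases
    b2 = beta ** 2
    f_score = (1 + b2) * ppv * sensitivity / (b2 * ppv + sensitivity)
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f_score": f_score,
        "mcc": mcc,
    }


# Illustrative counts only; these are NOT the study's data.
example = diagnostic_metrics(tp=40, fp=10, fn=10, tn=140)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Unlike the F-score, the MCC uses all four cells of the table, which makes it a useful complement when the condition is uncommon, as PHN is here.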

RESULTS

The NLP algorithm identified PHN cases with 90.9% sensitivity, 98.5% specificity, 82.0% PPV, and 99.3% NPV. The composite scores of the NLP algorithm were 0.89 (F-score) and 0.85 (MCC). The prevalence of PHN in the validation data was 6.9% (reference standard), 7.6% (NLP), and 5.4-13.1% (code-based). The code-based methods achieved 52.7-61.8% sensitivity, 89.8-98.4% specificity, 27.6-72.1% PPV, and 96.3-97.1% NPV. Their F-scores ranged from 0.45 to 0.59, and their MCCs ranged from 0.32 to 0.61.
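PPV and NPV depend on how common PHN is in the sample, so the reported values can be cross-checked against the reported sensitivity, specificity, and reference-standard prevalence using Bayes' rule. The sketch below is an illustration under that assumption; the function name `predictive_values` is hypothetical, and the inputs are the rounded figures from this abstract, so small rounding differences are expected.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Derive PPV and NPV from sensitivity, specificity, and prevalence.

    Hypothetical helper: each term is the expected proportion of the
    sample falling in one cell of the 2x2 table.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded values reported for the NLP algorithm in the validation data.
ppv, npv = predictive_values(sensitivity=0.909, specificity=0.985, prevalence=0.069)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With these inputs the derived PPV is about 0.82 and the derived NPV about 0.993, in line with the reported 82.0% and 99.3%.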

CONCLUSIONS

The automated NLP-based approach identified PHN cases from the EHR with good accuracy. This method could be useful in population-based PHN research.

Publisher

JMIR Publications Inc.
