Ascertainment of Veterans With Metastatic Prostate Cancer in Electronic Health Records: Demonstrating the Case for Natural Language Processing

Author:

Alba Patrick R. (1, 2), Gao Anthony (1, 2), Lee Kyung Min (1, 2), Anglin-Foote Tori (1, 2), Robison Brian (1, 2), Katsoulakis Evangelia (3), Rose Brent S. (4, 5), Efimova Olga (1, 2), Ferraro Jeffrey P. (1, 2), Patterson Olga V. (1, 2), Shelton Jeremy B. (6, 7), Duvall Scott L. (1, 2), Lynch Julie A. (1, 2, 8)

Affiliation:

1. VA Informatics and Computing Infrastructure, VA Salt Lake City Health Care System, Salt Lake City, UT

2. Division of Epidemiology, Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT

3. Department of Radiation Oncology, James A. Haley Veterans Affairs Healthcare System, Tampa, FL

4. VA San Diego Health Care System, La Jolla, CA

5. Division of Radiation Medicine and Applied Sciences, University of California San Diego, La Jolla, CA

6. VA Greater Los Angeles Healthcare System, Los Angeles, CA

7. University of California, Los Angeles School of Medicine, Los Angeles, CA

8. Department of Nursing and Health Sciences, University of Massachusetts, Boston, Boston, MA

Abstract

PURPOSE Prostate cancer (PCa) is among the leading causes of cancer deaths. While localized PCa has a 5-year survival rate approaching 100%, this rate drops to 31% for metastatic prostate cancer (mPCa). Thus, timely identification of mPCa is a crucial step toward measuring and improving access to innovations that reduce PCa mortality. Yet methods to identify patients diagnosed with mPCa remain elusive. Cancer registries provide detailed data at diagnosis but are not updated throughout treatment. This study reports on the development and validation of a natural language processing (NLP) algorithm deployed on oncology, urology, and radiology clinical notes to identify patients with a diagnosis or history of mPCa in the Department of Veterans Affairs.

PATIENTS AND METHODS Using a broad set of diagnosis and histology codes, the Veterans Affairs Corporate Data Warehouse was queried to identify all Veterans with PCa. An NLP algorithm was developed to identify patients with any history or progression of mPCa. The NLP algorithm was prototyped and developed iteratively using patient notes, grouped into development, training, and validation subsets.

RESULTS A total of 1,144,610 Veterans were diagnosed with PCa between January 2000 and October 2020, of whom 76,082 (6.6%) were identified by NLP as having mPCa at some point during their care. The NLP system performed with a specificity of 0.979 and a sensitivity of 0.919.

CONCLUSION Clinical documentation of mPCa is highly reliable. NLP can be leveraged to improve PCa data. When compared with other methods, NLP identified a significantly greater number of patients. NLP can be used to augment cancer registry data, facilitate research inquiries, and identify patients who may benefit from innovations in mPCa treatment.
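For readers less familiar with the metrics in RESULTS, the following is a minimal sketch of how sensitivity, specificity, and the reported 6.6% share are computed. The function names are illustrative, and the confusion-matrix counts in the usage lines are hypothetical round numbers, not the study's validation data; only the two cohort totals come from the abstract.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of patients who truly have the condition that the classifier flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of patients without the condition that the classifier correctly clears."""
    return true_neg / (true_neg + false_pos)


# Cohort totals as reported in the abstract.
TOTAL_PCA = 1_144_610   # Veterans diagnosed with PCa, January 2000 to October 2020
NLP_MPCA = 76_082       # of those, identified by NLP as having mPCa

share_mpca = NLP_MPCA / TOTAL_PCA
print(f"NLP-identified mPCa share: {share_mpca:.1%}")

# HYPOTHETICAL counts, chosen only to illustrate the formulas.
example_sens = sensitivity(true_pos=90, false_neg=10)
example_spec = specificity(true_neg=95, false_pos=5)
print(f"example sensitivity={example_sens:.2f}, specificity={example_spec:.2f}")
```

Sensitivity answers "how many true mPCa cases did the algorithm find?", while specificity answers "how many non-mPCa cases did it correctly leave alone?"; the abstract reports both because a case-finding tool needs to do well on each.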

Publisher

American Society of Clinical Oncology (ASCO)

Subject

General Medicine
