Context-Based Identification of Muscle Invasion Status in Patients With Bladder Cancer Using Natural Language Processing

Author:

Yang Ruixin (1), Zhu Di (1), Howard Lauren E. (1, 2), De Hoedt Amanda (1), Schroeck Florian R. (3, 4), Klaassen Zachary (5, 6), Freedland Stephen J. (1, 7, 8), Williams Stephen B. (1, 9)

Affiliation:

1. Urology Section, Department of Surgery, Veterans Affairs Health Care System, Durham, NC

2. Duke Cancer Institute, Duke University School of Medicine, Durham, NC

3. White River Junction VA Medical Center, White River Junction, VT

4. The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth College, Lebanon, NH

5. Division of Urology, Medical College of Georgia at Augusta University, Augusta, GA

6. Georgia Cancer Center, Augusta, GA

7. Division of Urology, Department of Surgery, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA

8. Center for Integrated Research in Cancer and Lifestyle, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA

9. Department of Surgery, Division of Urology, The University of Texas Medical Branch at Galveston, Galveston, TX

Abstract

PURPOSE Mortality from bladder cancer (BC) increases exponentially once the cancer invades the muscle, yet invasion status is inherently challenging to delineate at the population level. We sought to develop and validate a natural language processing (NLP) model for automatically identifying patients with muscle-invasive bladder cancer (MIBC).

METHODS All patients with a Current Procedural Terminology code for transurethral resection of bladder tumor (TURBT; n = 76,060) were selected from the Department of Veterans Affairs (VA) database. A sample of 600 patients (with 2,337 full-text notes) who had TURBT and confirmed pathology results was selected for NLP model development and validation. NLP performance was assessed by calculating the sensitivity, specificity, positive predictive value, negative predictive value, F1 score, and overall accuracy at the individual note and patient levels.

RESULTS In the validation cohort, the NLP model had average overall accuracies of 94% and 96% at the note and patient levels, respectively. Specifically, the F1 score and overall accuracy for predicting muscle invasion at the patient level were 0.87 and 96%, respectively. The model classified non-muscle-invasive bladder cancer (NMIBC) with overall accuracies of 90% and 93% at the note and patient levels, respectively. When applied to 71,200 patients VA-wide, the model classified 13,642 (19%) as having MIBC and 47,595 (66%) as having NMIBC and was able to identify invasion status for 96% of patients with TURBT at the population level. Inherent limitations include a relatively small training set, given the size of the VA population.

CONCLUSION This NLP model, with high accuracy, may be a practical tool for efficiently identifying BC invasion status and may aid population-based BC research.
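The performance measures named in the Methods (sensitivity, specificity, positive predictive value, negative predictive value, F1 score, and overall accuracy) can all be derived from a 2x2 confusion matrix. The following is a minimal sketch of those standard formulas, not the authors' code: the names `ConfusionCounts` and `binary_metrics` are hypothetical, and the counts in the example are invented for illustration and are not taken from the study.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for one binary label (e.g. MIBC vs. not MIBC). Hypothetical container."""
    tp: int  # predicted positive, truly positive
    fp: int  # predicted positive, truly negative
    tn: int  # predicted negative, truly negative
    fn: int  # predicted negative, truly positive


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a metric is undefined (empty denominator).
    return numerator / denominator if denominator else math.nan


def binary_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard binary classification metrics from confusion-matrix counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),         # recall
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),                 # precision
        "npv": _ratio(c.tn, c.tn + c.fn),
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),   # harmonic mean of PPV and sensitivity
        "accuracy": _ratio(c.tp + c.tn, total),
    }


if __name__ == "__main__":
    # Invented counts for illustration only; NOT results from the study.
    example = ConfusionCounts(tp=40, fp=5, tn=150, fn=5)
    for name, value in binary_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

The same function would apply at either level of analysis: the counts are tallied per note for note-level metrics and per patient for patient-level metrics. Note that the F1 score is a proportion between 0 and 1, whereas accuracy is conventionally reported as a percentage.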

Publisher

American Society of Clinical Oncology (ASCO)

Subject

General Medicine

Cited by 4 articles.
