Identification of Patients With Metastatic Prostate Cancer With Natural Language Processing and Machine Learning

Published: 2022-12 Issue: 6
ISSN:2473-4276
Container-title:JCO Clinical Cancer Informatics
language:en
Short-container-title:JCO Clinical Cancer Informatics

Author:

Yang Ruixin¹, Zhu Di¹, Howard Lauren E.¹˒², De Hoedt Amanda¹, Williams Stephen B.³, Freedland Stephen J.¹˒⁴˒⁵, Klaassen Zachary⁶˒⁷

Affiliation:

1. Urology Section, Department of Surgery, Veterans Affairs Health Care System, Durham, NC

2. Duke Cancer Institute, Duke University School of Medicine, Durham, NC

3. Division of Urology, Department of Surgery, The University of Texas Medical Branch, Galveston, TX

4. Division of Urology, Department of Surgery, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA

5. Center for Integrated Research in Cancer and Lifestyle, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA

6. Division of Urology, Medical College of Georgia at Augusta University, Augusta, GA

7. Georgia Cancer Center, Augusta, GA

Abstract

PURPOSE Understanding treatment patterns and effectiveness for patients with metastatic prostate cancer (mPCa) depends on accurate assessment of metastatic status. The objective was to develop a natural language processing (NLP) model for identifying patients with mPCa and to evaluate the model's performance against chart-reviewed data and an International Classification of Diseases (ICD) 9/10 code–based method.

METHODS In total, 139,057 radiology reports on 6,211 unique patients from the Department of Veterans Affairs were used. The gold standard was metastases identified by detailed chart review of radiology reports. NLP performance was assessed by sensitivity, specificity, positive predictive value, negative predictive value, and date of metastases detection. Receiver operating characteristic curves were used to assess model performance.

RESULTS When compared with chart review, the NLP model had high sensitivity and specificity (85% and 96%, respectively). The NLP model predicted patient-level metastasis status with a sensitivity of 91% and a specificity of 81%, whereas sensitivity and specificity using ICD9/10 billing codes were 73% and 86%, respectively. For the NLP model, the date of metastases detection was exactly concordant in 55% of patients and concordant within less than 1 week in 58%, compared with 8% and 17%, respectively, using the ICD9/10 billing codes method. The area under the curve for the NLP model was 0.911. A limitation is that the NLP model was developed on the basis of a subset of patients with mPCa and may not be generalizable to all patients with mPCa.

CONCLUSION This population-level NLP model for identifying patients with mPCa was more accurate than ICD9/10 billing codes when compared with chart-reviewed data. Upon further validation, this model may allow for efficient population-level identification of patients with mPCa.
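The four performance measures named in METHODS (sensitivity, specificity, positive predictive value, negative predictive value) all derive from a 2x2 comparison of model output against the chart-review gold standard. The following is a minimal Python sketch of that calculation, not the authors' code: the names `ConfusionCounts`, `count_outcomes`, and `diagnostic_metrics` are hypothetical, and the example labels are made up for illustration and are not study data.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2x2 table: model prediction vs. gold-standard label."""
    tp: int  # predicted metastatic, gold standard metastatic
    fp: int  # predicted metastatic, gold standard non-metastatic
    tn: int  # predicted non-metastatic, gold standard non-metastatic
    fn: int  # predicted non-metastatic, gold standard metastatic


def count_outcomes(predicted: Sequence[bool], reference: Sequence[bool]) -> ConfusionCounts:
    """Tally the 2x2 table from paired boolean labels (True = metastatic)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    tp = sum(p and r for p, r in zip(predicted, reference))
    fp = sum(p and not r for p, r in zip(predicted, reference))
    tn = sum(not p and not r for p, r in zip(predicted, reference))
    fn = sum(not p and r for p, r in zip(predicted, reference))
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def diagnostic_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Standard definitions; assumes every denominator is non-zero."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # true positives among all true cases
        "specificity": c.tn / (c.tn + c.fp),  # true negatives among all non-cases
        "ppv": c.tp / (c.tp + c.fp),          # true cases among positive predictions
        "npv": c.tn / (c.tn + c.fn),          # non-cases among negative predictions
    }


# Illustrative labels only (NOT data from the study): 10 true cases, 20 non-cases.
reference = [True] * 10 + [False] * 20
predicted = [True] * 8 + [False] * 2 + [False] * 18 + [True] * 2

counts = count_outcomes(predicted, reference)
metrics = diagnostic_metrics(counts)
```

The same functions apply whether the unit of analysis is a single radiology report or a patient; only the way the paired labels are assembled changes.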

Publisher

American Society of Clinical Oncology (ASCO)

Subject

General Medicine

Link

https://ascopubs.org/doi/pdfdirect/10.1200/CCI.21.00071

References (5 of 21 articles shown)

1. Secondary use of electronic medical records for clinical research: challenges and opportunities

2. Electronic Health Record Adoption In US Hospitals: Progress Continues, But Challenges Persist

3. Measuring Diagnoses: ICD Code Accuracy

4. Effective factors on accuracy of principal diagnosis coding based on International Classification of Diseases, the 10th revision (ICD-10)

5. Accuracy of ICD-9-CM codes in hospital morbidity data, Victoria: implications for public health research

Cited by 3 articles.

1. Applying Natural Language Processing to Single-Report Prediction of Metastatic Disease Response Using the OR-RADS Lexicon; Cancers; 2023-10-10

2. Approach to machine learning for extraction of real-world data variables from electronic health records; Frontiers in Pharmacology; 2023-09-15

3. Erratum; JCO Clinical Cancer Informatics; 2023-01