Selective prediction for extracting unstructured clinical data

Author:

Swaminathan Akshay (1, 2), Lopez Ivan (1, 2), Wang William (3, 4), Srivastava Ujwal (5), Tran Edward (5, 6), Bhargava-Shah Aarohi (1), Wu Janet Y (1), Ren Alexander L (1), Caoili Kaitlin (1), Bui Brandon (7), Alkhani Layth (4, 8), Lee Susan (5), Mohit Nathan (5, 7), Seo Noel (9), Macedo Nicholas (3, 10), Cheng Winson (5, 8), Liu Charles (11), Thomas Reena (12), Chen Jonathan H (13, 14, 15, 16), Gevaert Olivier (13, 16)

Affiliation:

1. Stanford University School of Medicine, Stanford, CA, United States

2. Cerebral Inc., Claymont, DE, United States

3. Department of Biology, Stanford University, Stanford, CA, United States

4. Department of Bioengineering, Stanford University, Stanford, CA, United States

5. Department of Computer Science, Stanford University, Stanford, CA, United States

6. Department of Management Science and Engineering, Stanford University, Stanford, CA, United States

7. Department of Human Biology, Stanford University, Stanford, CA, United States

8. Department of Chemistry, Stanford University, Stanford, CA, United States

9. Department of Sociology, Stanford University, Stanford, CA, United States

10. Department of Radiology, Stanford University School of Medicine, Stanford, CA, United States

11. Department of Surgery, Stanford University School of Medicine, Stanford, CA, United States

12. Department of Neurology and Neurological Sciences, Stanford Health Care, Stanford, CA, United States

13. Stanford Center for Biomedical Informatics Research, Stanford, CA, United States

14. Division of Hospital Medicine, Stanford, CA, United States

15. Clinical Excellence Research Center, Stanford, CA, United States

16. Department of Medicine, Stanford, CA, United States

Abstract

Objective: Existing approaches to handling unstructured clinical data, such as manual abstraction and structured proxy variables, can be time-consuming, difficult to scale, and imprecise. This article aims to determine whether selective prediction, which gives a model the option to abstain from generating a prediction, can improve the accuracy and efficiency of unstructured clinical data abstraction.

Materials and Methods: We trained selective classifiers (logistic regression, random forest, support vector machine) to extract 5 variables from clinical notes: depression (n = 1563), glioblastoma (GBM, n = 659), rectal adenocarcinoma (DRA, n = 601), abdominoperineal resection of adenocarcinoma (APR, n = 601), and low anterior resection of adenocarcinoma (LAR, n = 601). We varied the costs of false positives (FP), false negatives (FN), and abstained notes, and measured the total misclassification cost.

Results: The depression selective classifiers abstained on 0% to 97% of notes, and the change in total misclassification cost ranged from −58% to 9%. Selective classifiers abstained on 5%–43% of notes across the GBM and colorectal cancer models. The GBM selective classifier abstained on 43% of notes, which led to improvements in sensitivity (0.94 to 0.96), specificity (0.79 to 0.96), positive predictive value (PPV, 0.89 to 0.98), and negative predictive value (NPV, 0.88 to 0.91) compared with both a non-selective classifier and structured proxy variables.

Discussion: We showed that selective classifiers outperformed both non-selective classifiers and structured proxy variables for extracting data from unstructured clinical notes.

Conclusion: Selective prediction should be considered when abstaining is preferable to making an incorrect prediction.
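The cost accounting described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the classifiers decide to abstain, so the two-threshold rule on predicted probabilities used here is an assumption, as are the function names, the example probabilities and labels, and the cost values. None of these come from the paper.

```python
from typing import List, Optional, Sequence


def selective_predict(
    probs: Sequence[float], lower: float, upper: float
) -> List[Optional[int]]:
    """Predict 1 if p >= upper, 0 if p <= lower, otherwise abstain (None).

    The two-threshold abstention rule is an illustrative assumption,
    not the method reported in the paper.
    """
    if not 0.0 <= lower <= upper <= 1.0:
        raise ValueError("require 0 <= lower <= upper <= 1")
    preds: List[Optional[int]] = []
    for p in probs:
        if p >= upper:
            preds.append(1)
        elif p <= lower:
            preds.append(0)
        else:
            preds.append(None)  # abstain: the note goes to manual review
    return preds


def total_cost(
    y_true: Sequence[int],
    preds: Sequence[Optional[int]],
    cost_fp: float,
    cost_fn: float,
    cost_abstain: float,
) -> float:
    """Sum the per-note costs of false positives, false negatives, and abstentions."""
    cost = 0.0
    for y, pred in zip(y_true, preds):
        if pred is None:
            cost += cost_abstain
        elif pred == 1 and y == 0:
            cost += cost_fp
        elif pred == 0 and y == 1:
            cost += cost_fn
    return cost


def abstention_rate(preds: Sequence[Optional[int]]) -> float:
    """Fraction of notes on which the classifier abstained."""
    return sum(p is None for p in preds) / len(preds)


# Hypothetical predicted probabilities and true labels for six notes.
probs = [0.05, 0.20, 0.45, 0.55, 0.80, 0.95]
y_true = [0, 0, 1, 0, 1, 1]

# lower == upper reproduces a non-selective classifier with a 0.5 cutoff.
non_selective = selective_predict(probs, lower=0.5, upper=0.5)
selective = selective_predict(probs, lower=0.3, upper=0.7)

for name, preds in [("non-selective", non_selective), ("selective", selective)]:
    cost = total_cost(y_true, preds, cost_fp=1.0, cost_fn=1.0, cost_abstain=0.25)
    print(name, "abstained:", abstention_rate(preds), "total cost:", cost)
```

In this toy setting, abstaining lowers the total cost only because an abstention is priced below an error. If `cost_abstain` were raised above the error costs, the non-selective classifier would be cheaper, which matches the abstract's conclusion that selective prediction is worth considering when abstaining is preferable to an incorrect prediction.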

Funder

National Institute on Drug Abuse Clinical Trials Network, Tuolc Inc, Roche Inc

Publisher

Oxford University Press (OUP)

Subject

Health Informatics
