Artificial Intelligence-Based Differential Diagnosis: Development and Validation of a Probabilistic Model to Address Lack of Large-Scale Clinical Datasets

Author:

Chishti ShahrukhORCID,Jaggi Karan RajORCID,Saini AnujORCID,Agarwal GauravORCID,Ranjan AshishORCID

Abstract

Background Machine-learning or deep-learning algorithms for clinical diagnosis are inherently dependent on the availability of large-scale clinical datasets. Lack of such datasets and inherent problems such as overfitting often necessitate the development of innovative solutions. Probabilistic modeling closely mimics the rationale behind clinical diagnosis and represents a unique solution. Objective The aim of this study was to develop and validate a probabilistic model for differential diagnosis in different medical domains. Methods Numerical values of symptom-disease associations were utilized to mathematically represent medical domain knowledge. These values served as the core engine for the probabilistic model. For the given set of symptoms, the model was utilized to produce a ranked list of differential diagnoses, which was compared to the differential diagnosis constructed by a physician in a consult. Practicing medical specialists were integral in the development and validation of this model. Clinical vignettes (patient case studies) were utilized to compare the accuracy of doctors and the model against the assumed gold standard. The accuracy analysis was carried out over the following metrics: top 3 accuracy, precision, and recall. Results The model demonstrated a statistically significant improvement (P=.002) in diagnostic accuracy (85%) as compared to the doctors’ performance (67%). This advantage was retained across all three categories of clinical vignettes: 100% vs 82% (P<.001) for highly specific disease presentation, 83% vs 65% for moderately specific disease presentation (P=.005), and 72% vs 49% (P<.001) for nonspecific disease presentation. The model performed slightly better than the doctors’ average in precision (62% vs 60%, P=.43) but there was no improvement with respect to recall (53% vs 56%, P=.27). However, neither difference was statistically significant. Conclusions The present study demonstrates a drastic improvement over previously reported results that can be attributed to the development of a stable probabilistic framework utilizing symptom-disease associations to mathematically represent medical domain knowledge. The current iteration relies on static, manually curated values for calculating the degree of association. Shifting to real-world data–derived values represents the next step in model development.

Publisher

JMIR Publications Inc.

Subject

Health Informatics

Reference20 articles.

1. Global Health Observatory (GHO) Data20192019-12-16World Health OrganizationDensity of physicians (total number per 1000 population), latest available yearhttps://www.who.int/gho/health_workforce/physicians_density_text/en/

2. WHO Global Health Workforce Statistics2019-12-16World Health Organizationhttp://www.who.int/hrh/statistics/hwfstats/

3. Third Global Forum on Human Resources for Health Report20132019-12-16A Universal Truth: No Health Without a Workforcehttps://www.who.int/workforcealliance/knowledge/resources/GHWA-a_universal_truth_report.pdf?ua=1

4. JoyceJStanford Encyclopedia of Philosophy2019-12-16Bayes’ Theoremhttps://plato.stanford.edu/entries/bayes-theorem/

Cited by 13 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3