Combination of machine learning algorithms with natural language processing may increase the probability of bacteremia detection in the emergency department: A retrospective, big-data analysis of 94,482 patients

Author:

Ben-Haim Gal (1, 2, 3), Yosef Mika (3), Rowand Eyade (2, 4), Ben-Yosef Jonathan (5), Berman Aya (6), Sina Sigal (3), Halabi Nitsan (3), Grossbard Eitan (7), Marziano Yehonatan (8), Segal Gad (2, 4)

Affiliation:

1. Emergency Department, Chaim Sheba Medical Center, Ramat-Gan, Israel

2. The Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel

3. ARC, Innovation Center, Chaim Sheba Medical Center, Ramat Gan, Israel

4. Education Authority, Chaim Sheba Medical Center, Ramat-Gan, Israel

5. Ort Melton High School, Bat-Yam, Israel

6. Dan Petah-Tikvah District, Clalit Health Services, Dan, Israel

7. Kaplan Medical Center, St George's University of London, program delivered by University of Nicosia at the Chaim Sheba Medical Center, Ramat-Gan, Israel

8. Barzilai Medical Center, St George's University of London, program delivered by University of Nicosia at the Chaim Sheba Medical Center, Ramat-Gan, Israel

Abstract

Background: Prompt diagnosis of bacteremia in the emergency department (ED) is of utmost importance. Nevertheless, the average time to the first clinical laboratory finding ranges from 1 to 3 days. Although many scoring systems exist for predicting occult bacteremia, efforts to apply artificial intelligence (AI) to this task are still preliminary. In the current study we combined an AI algorithm with a Natural Language Processing (NLP) algorithm to potentially increase the yield extracted from clinical ED data.

Methods: This study involved adult patients who visited our emergency department and had at least one blood culture taken to rule out bacteremia. Using both tabular and free-text data, we built an ensemble model that applies XGBoost to the structured data and logistic regression (LR) to the textual data, with the text represented as bag-of-words (BOW) features weighted by Term Frequency-Inverse Document Frequency (TF-IDF). All algorithms were designed to predict the risk of bacteremia in ED patients whose blood cultures were sent to the laboratory.

Results: The study cohort comprised 94,482 individuals, of whom 52% were male. The prevalence of bacteremia in the entire cohort was 9.7%. The XGBoost model trained on the tabular data yielded an area under the curve (AUC) of 73.7%, while the LR model trained on the free text achieved an AUC of 71.3%. After checking a range of weights, the best combination placed 55% weight on the XGBoost prediction and 45% weight on the LR prediction. The final model prediction yielded an AUC of 75.6%.

Conclusion: Applying artificial intelligence to bacteremia surveillance in the ED setting, by combining free-text and tabular data analysis, improved predictive performance compared with using tabular data alone. We recommend that future AI applications based on our findings be assimilated into the clinical routines of ED physicians.
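The weighting step reported in the Results (a convex combination of the two models' predicted probabilities, chosen after checking a range of weights) can be illustrated with a minimal sketch. This is not the authors' code: the function names `roc_auc`, `blend`, and `best_weight`, the grid size, and the toy scores are all assumptions made for illustration, and in practice the weight would be selected on held-out validation data rather than on the data used for final evaluation.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def blend(p_tab: Sequence[float], p_text: Sequence[float], w_tab: float) -> List[float]:
    """Weighted average: w_tab on the tabular model, (1 - w_tab) on the text model."""
    return [w_tab * a + (1.0 - w_tab) * b for a, b in zip(p_tab, p_text)]


def best_weight(
    labels: Sequence[int],
    p_tab: Sequence[float],
    p_text: Sequence[float],
    n_steps: int = 20,  # hypothetical grid: weights 0.00, 0.05, ..., 1.00
) -> Tuple[float, float]:
    """Return the (weight, AUC) pair with the highest AUC on the grid."""
    best = (0.0, -1.0)
    for i in range(n_steps + 1):
        w = i / n_steps
        auc = roc_auc(labels, blend(p_tab, p_text, w))
        if auc > best[1]:
            best = (w, auc)
    return best


if __name__ == "__main__":
    # Toy, made-up scores standing in for the two models' outputs.
    y = [0, 0, 1, 1, 0, 1, 0, 1]
    tab = [0.10, 0.40, 0.35, 0.80, 0.20, 0.30, 0.50, 0.90]
    txt = [0.20, 0.10, 0.60, 0.40, 0.55, 0.70, 0.30, 0.65]
    w, auc = best_weight(y, tab, txt)
    print(f"weight on tabular model: {w:.2f}, blended AUC: {auc:.3f}")
```

Because the grid includes the weights 0 and 1, the blended AUC on the data used for selection can never be lower than that of either single model, which is one reason an independent test set is needed to confirm any reported gain.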

Publisher

SAGE Publications
