Evaluation of crowdsourced mortality prediction models as a framework for assessing AI in medicine

Published: 2021-01-20

Author:

Bergquist Timothy, Schaffter Thomas, Yan Yao, Yu Thomas, Prosser Justin, Gao Jifan, Chen Guanhua, Charzewski Łukasz, Nawalany Zofia, Brugere Ivan, Retkute Renata, Prusokas Alidivinas, Prusokas Augustinas, Choi Yonghwa, Lee Sanghoon, Choe Junseok, Lee Inggeol, Kim Sunkyu, Kang Jaewoo, Mooney Sean D., Guinney Justin

Abstract

Applications of machine learning in healthcare are of high interest and have the potential to significantly improve patient care. Yet the real-world accuracy and performance of these models on different patient subpopulations remain unclear. To address these questions, we hosted a community challenge to evaluate different methods that predict healthcare outcomes. To overcome patient privacy concerns, we employed a Model-to-Data approach, allowing citizen scientists and researchers to train and evaluate machine learning models on private health data without direct access to that data. We focused on the prediction of all-cause mortality as the community challenge question. In total, we had 345 registered participants, coalescing into 25 independent teams, spread over 3 continents and 10 countries. The top-performing team achieved a final area under the receiver operating characteristic curve of 0.947 (95% CI 0.942, 0.951) and an area under the precision-recall curve of 0.487 (95% CI 0.458, 0.499) on patients prospectively collected over a one-year observation period at a large health system. Post-hoc analysis after the challenge revealed that models differ in accuracy on subpopulations, delineated by race or gender, even when they are trained on the same data and have similar accuracy on the overall population. This is the largest community challenge focused on the evaluation of state-of-the-art machine learning methods in a healthcare system performed to date, revealing both opportunities and pitfalls of clinical AI.
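The subpopulation comparison described in the abstract can be illustrated with a minimal sketch. It is not the challenge's evaluation code: the abstract does not say how the confidence intervals were computed, so the percentile bootstrap here is an assumption, and the function names (auroc, bootstrap_ci, auroc_by_group) and the toy labels, scores, and groups are hypothetical.

```python
import random

def auroc(labels, scores):
    """AUROC as the chance a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (assumed method, not from the paper)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to bootstrap AUROC")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

def auroc_by_group(labels, scores, groups):
    """AUROC computed separately within each subgroup label."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, x in enumerate(groups) if x == g]
        result[g] = auroc([labels[i] for i in idx], [scores[i] for i in idx])
    return result

# Hypothetical toy data: 1 = died within the prediction window, 0 = survived.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.5, 0.6]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall = auroc(labels, scores)
per_group = auroc_by_group(labels, scores, groups)
ci = bootstrap_ci(labels, scores)
```

On this toy data the overall AUROC is 0.875, while the per-group values are 0.75 for group A and 1.0 for group B, which shows how a single population-level score can hide a gap between subgroups.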

Publisher

Cold Spring Harbor Laboratory

References: 30 articles.

1. Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review

2. Risk prediction of delirium in hospitalized patients using machine learning: An implementation and prospective evaluation study;J. Am. Med. Inform. Assoc.;2020

3. The self‐assessment trap: can we all be better than average?

4. Decaying relevance of clinical data towards future decisions in data-driven inpatient clinical order sets

5. Racial Treatment Disparities after Machine Learning Surgical-Appropriateness Adjustment

Cited by 5 articles.

1. Automated stratification of trauma injury severity across multiple body regions using multi-modal, multi-class machine learning models;2024-01-22

2. A Multifaceted benchmarking of synthetic electronic health record generation models;Nature Communications;2022-12-09

3. The State of Machine Learning in Outcomes Prediction of Transsphenoidal Surgery: A Systematic Review;Journal of Neurological Surgery Part B: Skull Base;2022-09-12

4. A Survey on Big Data Application for Modality and Physiological Signal Analysis;Advances in Intelligent Systems and Technologies;2022-07-30

5. A Continuously Benchmarked and Crowdsourced Challenge for Rapid Development and Evaluation of Models to Predict COVID-19 Diagnosis and Hospitalization;JAMA Network Open;2021-10-11