Evaluation of crowdsourced mortality prediction models as a framework for assessing artificial intelligence in medicine
Author:
Bergquist Timothy (1, 2), Schaffter Thomas (1), Yan Yao (1, 3), Yu Thomas (1), Prosser Justin (4), Gao Jifan (5), Chen Guanhua (5), Charzewski Łukasz (6, 7), Nawalany Zofia (6), Brugere Ivan (8), Retkute Renata (9), Prusokas Alidivinas (10), Prusokas Augustinas (11), Choi Yonghwa (12), Lee Sanghoon (12), Choe Junseok (12), Lee Inggeol (13), Kim Sunkyu (12), Kang Jaewoo (12, 13), Mooney Sean D (2), Guinney Justin (1, 2), Lee Aaron, Salehzadeh-Yazdi Ali, Prusokas Alidivinas, Basu Anand, Belouali Anas, Becker Ann-Kristin, Israel Ariel, Prusokas Augustinas, Winter B, Moreno Carlos Vega, Kurz Christoph, Waltemath Dagmar, Schweinoch Darius, Glaab Enrico, Luo Gang, Chen Guanhua, Zacharias Helena U, Qiao Hezhe, Lee Inggeol, Brugere Ivan, Kang Jaewoo, Gao Jifan, Truthmann Julia, Choe JunSeok, Stephens Kari A, Kaderali Lars, Varshney Lav R, Vollmer Marcus, Pandi Maria-Theodora, Gunn Martin L, Yetisgen Meliha, Nath Neetika, Hammarlund Noah, Müller-Stricker Oliver, Togias Panagiotis, Heagerty Patrick J, Muir Peter, Banda Peter, Retkute Renata, Henkel Ron, Madgi Sagar, Gupta Samir, Lee Sanghoon, Mooney Sean, Kannattikuni Shabeeb, Sarhadi Shamim, Omar Shikhar, Wang Shuo, Ghosh Soumyabrata, Neumann Stefan, Simm Stefan, Madhavan Subha, Kim Sunkyu, Von Yu Thomas, Satagopam Venkata, Pejaver Vikas, Gupta Yachee, Choi Yonghwa, Nawalany Zofia, Charzewski Łukasz
Affiliation:
1. Sage Bionetworks, Seattle, WA, United States
2. Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, United States
3. Molecular Engineering and Sciences Institute, University of Washington, Seattle, WA, United States
4. Institute of Translational Health Sciences, University of Washington, Seattle, WA, United States
5. Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, United States
6. Proacta, Warsaw, Poland
7. Division of Biophysics, University of Warsaw, Warsaw, Poland
8. Department of Computer Science, University of Illinois at Chicago, Chicago, IL, United States
9. Department of Plant Sciences, University of Cambridge, Cambridge, United Kingdom
10. Plant and Molecular Sciences, School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
11. Department of Life Sciences, Imperial College London, London, United Kingdom
12. Department of Computer Science and Engineering, College of Informatics, Korea University, Seoul, Republic of Korea
13. Department of Interdisciplinary Program in Bioinformatics, College of Informatics, Korea University, Seoul, Republic of Korea
Abstract
Objective
Applications of machine learning in healthcare are of high interest and have the potential to improve patient care. Yet the real-world accuracy of these models in clinical practice and on different patient subpopulations remains unclear. To address these questions, we hosted a community challenge to evaluate methods that predict healthcare outcomes, with the prediction of all-cause mortality as the challenge question.
Materials and methods
Using a Model-to-Data framework, 345 registered participants, coalescing into 25 independent teams spread over 3 continents and 10 countries, generated 25 accurate models, all trained on a dataset of over 1.1 million patients and evaluated on patients prospectively collected over a 1-year observation period at a large health system.
Results
The top-performing team achieved a final area under the receiver operating characteristic curve (AUROC) of 0.947 (95% CI, 0.942-0.951) and an area under the precision-recall curve (AUPRC) of 0.487 (95% CI, 0.458-0.499) on a prospectively collected patient cohort.
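For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how an area under the receiver operating characteristic curve, an area under the precision-recall curve (estimated here as average precision), and a 95% interval can be computed from predicted risks. The abstract does not state how its intervals were obtained; the percentile bootstrap, the function names, and the toy data are all assumptions made for illustration, not the challenge's scoring code.

```python
import random

def auroc(labels, scores):
    """AUROC via the Mann-Whitney rank statistic; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def average_precision(labels, scores):
    """Step-wise AUPRC estimate: sum of recall increments times precision."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(labels)
    tp = fp = 0
    ap = 0.0
    i = 0
    while i < len(pairs):
        j = i
        tp_group = 0
        # Treat all predictions sharing one score as a single threshold.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp_group += pairs[j][1]
            j += 1
        tp += tp_group
        fp += (j - i) - tp_group
        ap += (tp_group / n_pos) * (tp / (tp + fp))
        i = j
    return ap

def bootstrap_ci(metric, labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (an assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the two classes
            stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy data: 1 = died within the prediction window, 0 = survived (hypothetical).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores), average_precision(labels, scores))
print(bootstrap_ci(auroc, labels, scores, n_boot=200))
```

Because deaths are rare in a general patient population, the precision-recall summary is typically far lower than the AUROC for the same model, which is consistent with the gap between the two figures reported above.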
Discussion
Post hoc analysis after the challenge revealed that models differ in accuracy on subpopulations, delineated by race or gender, even when they are trained on the same data.
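A subgroup comparison of this kind can be sketched as follows: the same predictions are scored separately within each demographic group, so that differences in discrimination become visible. This is an illustrative sketch only; the helper names, the group labels "A" and "B", and the toy data are hypothetical and do not reproduce the challenge's post hoc analysis.

```python
from collections import defaultdict

def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined when a group lacks one of the two classes
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auroc_by_group(labels, scores, groups):
    """Compute AUROC separately for each subgroup label."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: auroc(ys, ss) for g, (ys, ss) in buckets.items()}

# Hypothetical example: one model, two subgroups, very different discrimination.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.1, 0.8, 0.2, 0.6, 0.7, 0.4, 0.3]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(auroc_by_group(labels, scores, groups))  # {'A': 1.0, 'B': 0.5}
```

The pairwise comparison is quadratic in group size, which is fine for a sketch but would need a rank-based formulation for cohorts of the scale described here.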
Conclusion
This is the largest community challenge focused on the evaluation of state-of-the-art machine learning methods in a healthcare system performed to date, revealing both opportunities and pitfalls of clinical AI.
Funder
Clinical and Translational Science Awards Program; National Center for Data to Health; National Center for Advancing Translational Sciences; National Institutes of Health; Institute for Translational Health Sciences
Publisher
Oxford University Press (OUP)
Subject
Health Informatics
Cited by
2 articles.