Potential and limitations of machine meta-learning (ensemble) methods for predicting COVID-19 mortality in a large inhospital Brazilian dataset
-
Published:2023-03-01
Issue:1
Volume:13
Page:
-
ISSN:2045-2322
-
Container-title:Scientific Reports
-
language:en
-
Short-container-title:Sci Rep
Author:
de Paiva Bruno Barbosa Miranda, Pereira Polianna Delfino, de Andrade Claudio Moisés Valiense, Gomes Virginia Mara Reis, Souza-Silva Maira Viana Rego, Martins Karina Paula Medeiros Prado, Sales Thaís Lorenna Souza, de Carvalho Rafael Lima Rodrigues, Pires Magda Carvalho, Ramos Lucas Emanuel Ferreira, Silva Rafael Tavares, de Freitas Martins Vieira Alessandra, Nunes Aline Gabrielle Sousa, de Oliveira Jorge Alzira, de Oliveira Maurílio Amanda, Scotton Ana Luiza Bahia Alves, da Silva Carla Thais Candida Alves, Cimini Christiane Corrêa Rodrigues, Ponce Daniela, Pereira Elayne Crestani, Manenti Euler Roberto Fernandes, Rodrigues Fernanda d’Athayde, Anschau Fernando, Botoni Fernando Antônio, Bartolazzi Frederico, Grizende Genna Maira Santos, Noal Helena Carolina, Duani Helena, Gomes Isabela Moraes, Costa Jamille Hemétrio Salles Martins, di Sabatino Santos Guimarães Júlia, Tupinambás Julia Teixeira, Rugolo Juliana Machado, Batista Joanna d’Arc Lyra, de Alvarenga Joice Coutinho, Chatkin José Miguel, Ruschel Karen Brasil, Zandoná Liege Barella, Pinheiro Lílian Santos, Menezes Luanna Silva Monteiro, de Oliveira Lucas Moyses Carvalho, Kopittke Luciane, Assis Luisa Argolo, Marques Luiza Margoto, Raposo Magda Cesar, Floriani Maiara Anschau, Bicalho Maria Aparecida Camargos, Nogueira Matheus Carvalho Alves, de Oliveira Neimy Ramos, Ziegelmann Patricia Klarmann, Paraiso Pedro Gibson, de Lima Martelli Petrônio José, Senger Roberta, Menezes Rochele Mosmann, Francisco Saionara Cristina, Araújo Silvia Ferreira, Kurtz Tatiana, Fereguetti Tatiani Oliveira, de Oliveira Thainara Conceição, Ribeiro Yara Cristina Neves Marques Barbosa, Ramires Yuri Carlotto, Lima Maria Clara Pontello Barbosa, Carneiro Marcelo, Bezerra Adriana Falangola Benjamin, Schwarzbold Alexandre Vargas, de Moura Costa André Soares, Farace Barbara Lopes, Silveira Daniel Vitorio, de Almeida Cenci Evelin Paola, Lucas Fernanda Barbosa, Aranha Fernando Graça, Bastos Gisele Alsina Nader, Vietta Giovanna Grunewald, Nascimento Guilherme Fagundes, Vianna Heloisa Reniers, Guimarães Henrique Cerqueira, de Morais Julia Drumond Parreiras, Moreira Leila Beltrami, de Oliveira Leonardo Seixas, de Deus Sousa Lucas, de Souza Viana Luciano, de Souza Cabral Máderson Alvares, Ferreira Maria Angélica Pires, de Godoy Mariana Frizzo, de Figueiredo Meire Pereira, Guimarães-Junior Milton Henriques, de Paula de Sordi Mônica Aparecida, da Cunha Severino Sampaio Natália, Assaf Pedro Ledic, Lutkmeier Raquel, Valacio Reginaldo Aparecido, Finger Renan Goulart, de Freitas Rufino, Guimarães Silvana Mangeon Meirelles, Oliveira Talita Fischer, Diniz Thulio Henrique Oliveira, Gonçalves Marcos André, Marcolino Milena Soriano
Abstract
The majority of early prediction scores and methods to predict COVID-19 mortality are bound by methodological flaws and technological limitations (e.g., the use of a single prediction model). Our aim is to provide a thorough comparative study that tackles those methodological issues, considering multiple techniques to build mortality prediction models, including modern machine learning (neural) algorithms and traditional statistical techniques, as well as meta-learning (ensemble) approaches. This study used a dataset from a multicenter cohort of 10,897 adult Brazilian COVID-19 patients admitted from March 2020 to November 2021 [median age 60 (interquartile range 48–71), 46% women]. We also proposed new original population-based meta-features that have not been devised in the literature. Stacking achieved the best results reported in the literature for the death prediction task, improving over the previous state of the art by more than 46% in Recall for predicting death, with an AUROC of 0.826 and a MacroF1 of 65.4%. The newly proposed meta-features were highly discriminative of death but fell short of producing large improvements in final prediction performance, suggesting that we are possibly at the limits of the prediction capabilities that can be achieved with the current set of ML techniques and (meta-)features. Finally, we investigated how the trained models perform on different hospitals, showing that there are indeed large differences in classifier performance between hospitals, further making the case that errors are produced by factors that cannot be modeled with the current predictors.
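For readers less familiar with the metrics quoted in the abstract, the following is a minimal sketch of how Recall for the death class and MacroF1 are typically computed. It is not the authors' code: the function names and the ten toy labels (1 = death, 0 = survival) are made up for illustration and are not data from the study.

```python
from typing import Sequence


def _counts(y_true: Sequence[int], y_pred: Sequence[int], label: int):
    """Return (tp, fp, fn) treating `label` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    return tp, fp, fn


def recall(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """Fraction of true `label` cases that the model also predicted as `label`."""
    tp, _, fn = _counts(y_true, y_pred, label)
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """F1 for one class: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = _counts(y_true, y_pred, label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1, so the rare class counts as much as the common one."""
    labels = sorted(set(y_true))
    return sum(f1(y_true, y_pred, c) for c in labels) / len(labels)


# Hypothetical toy labels (1 = death, 0 = survival); NOT data from the study.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

death_recall = recall(y_true, y_pred, label=1)  # 1 of 2 deaths caught
score = macro_f1(y_true, y_pred)                # mean of 0.875 and 0.5
print(f"death recall = {death_recall:.3f}, macro-F1 = {score:.4f}")
```

On these toy labels accuracy is 80%, yet Recall for death is only 0.5 and MacroF1 is 0.6875, which illustrates why class-sensitive metrics are reported for imbalanced outcomes such as in-hospital death.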
Publisher
Springer Science and Business Media LLC
Subject
Multidisciplinary
Reference
42 articles.
Cited by
4 articles.