BACKGROUND
The quality of a machine learning model depends considerably on the size of its training dataset, yet the development and widespread application of machine learning have often been hindered by confidentiality concerns, particularly regarding data privacy. Predicting mortality is essential in clinical environments. When a patient is admitted, estimating their likelihood of death by the end of their intensive care unit (ICU) stay or within a designated time frame is a way to assess the severity of their condition. This information is crucial for treatment planning and resource allocation. However, individual hospitals typically have too little local data to build a reliable model. Federated learning, an emerging privacy-preserving technology, offers the potential to create models collaboratively in a decentralized manner, eliminating the need to consolidate all datasets in a single location. Nonetheless, there is a scarcity of clear and comprehensive evidence comparing the performance of federated learning with that of traditional centralized machine learning approaches, particularly in health care implementation.
OBJECTIVE
This study aims to compare the performance of federated learning (FL)-based and centralized machine learning (CML) models for mortality prediction in clinical settings.
METHODS
An electronic database search was conducted for English-language articles that developed FL-based models to predict mortality. Screening, data extraction, and risk of bias assessment were carried out by at least two independent reviewers. Area under the receiver operating characteristic curve (AUROC/AUC) values were pooled in meta-analyses for FL, CML, and local machine learning (LML) models. Data extraction and risk of bias assessment followed the checklist for critical appraisal and data extraction for systematic reviews of prediction modeling studies (CHARMS) and the prediction model risk of bias assessment tool (PROBAST) guidelines.
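The pooling step described above can be illustrated with a minimal sketch. It assumes a DerSimonian-Laird random-effects model applied to AUC values on their original scale, with standard errors recovered from reported 95% CIs. The abstract does not state which estimator, scale, or software the review used, so the function name `pool_random_effects` and the example inputs are hypothetical and for illustration only.

```python
import math


def pool_random_effects(estimates, ci_lowers, ci_uppers, z=1.96):
    """Pool study-level estimates with a DerSimonian-Laird random-effects model.

    Assumption: each study reports a symmetric 95% CI, so its standard
    error can be approximated as (upper - lower) / (2 * z).
    """
    variances = [((u - l) / (2 * z)) ** 2 for l, u in zip(ci_lowers, ci_uppers)]

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q and its degrees of freedom.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1

    # Between-study variance (tau^2), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights, pooled estimate, and 95% CI.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # I^2: percentage of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci_low": pooled - z * se,
        "ci_high": pooled + z * se,
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical AUCs and 95% CIs, NOT values from the included studies.
    result = pool_random_effects(
        estimates=[0.78, 0.84, 0.80],
        ci_lowers=[0.74, 0.80, 0.75],
        ci_uppers=[0.82, 0.88, 0.85],
    )
    print(result)
```

Because AUC is bounded between 0 and 1, analyses of this kind sometimes pool on a transformed (for example, logit) scale instead; the raw scale is used here only to keep the sketch short.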
RESULTS
In total, 9 articles that were heterogeneous in framework design, scenario, and clinical context were included (n = 5 [55.6%] addressed specific clinical cases; n = 3 [33.3%] were in ICU settings; and n = 2 [22.2%] were in emergency department, urgent care, or trauma center settings). All included studies used cohort datasets. The studies consistently indicated that FL models outperformed LML models and performed close to CML models. The pooled AUCs for FL and CML (or LML) models were 0.81 (95% CI 0.76–0.85; I² = 78.36%) and 0.82 (95% CI 0.77–0.86; I² = 72.33%), respectively. Risk of bias across the included studies was rated as low, high, or unclear.
CONCLUSIONS
This systematic review and meta-analysis demonstrates that FL models outperform LML approaches and are comparable to CML models. However, the efficiency of FL may be compromised by framework complexity, privacy-preservation requirements, and high computation and communication costs.
CLINICALTRIAL
PROSPERO International Prospective Register of Systematic Reviews CRD42024539245; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=539245