BACKGROUND
Accurate patient outcome prediction in the intensive care unit (ICU) can lead to more effective and efficient patient care. Deep learning models can learn from data to predict patient outcomes accurately, but they typically require large amounts of data and computational resources. Transfer learning (TL) can help where data and computational resources are scarce by leveraging pre-trained models. While TL has been widely used in medical imaging and natural language processing, it has rarely been applied to electronic health record (EHR) analysis. Furthermore, domain adaptation (DA) has been the most commonly used TL method in general, whereas inductive transfer learning (ITL) has seldom been used. To the best of our knowledge, DA and ITL have never been studied in depth in the context of EHR-based ICU patient outcome prediction.
OBJECTIVE
This study investigated DA and the rarely studied ITL for EHR-based ICU patient outcome prediction under simulated, varying levels of data scarcity.
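To make the idea of inductive transfer concrete, the following is a minimal sketch of parameter transfer: a model is pre-trained on a large source cohort and then fine-tuned on a scarce target cohort, alongside a baseline trained on the target cohort alone. Everything here is an illustrative assumption rather than the study's method: the data are synthetic, the model is a plain logistic regression instead of a neural network, and the names `train_logreg` and `make_cohort`, the coefficients, and the hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logreg(X, y, w_init=None, lr=0.5, epochs=100):
    """Batch gradient descent for logistic regression.

    The last weight is the bias. Passing w_init warm-starts training
    from previously learned weights (the transfer step).
    """
    d = len(X[0])
    w = list(w_init) if w_init is not None else [0.0] * (d + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            # zip stops after the d feature weights; the bias is added separately.
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1])
            err = p - yi
            for j in range(d):
                grad[j] += err * xi[j]
            grad[-1] += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def make_cohort(n, coefs, rng):
    """Synthetic binary-outcome cohort (hypothetical; not ICU data)."""
    X = [[rng.gauss(0, 1) for _ in coefs] for _ in range(n)]
    y = [
        1 if rng.random() < sigmoid(sum(c * v for c, v in zip(coefs, x))) else 0
        for x in X
    ]
    return X, y


rng = random.Random(0)
X_src, y_src = make_cohort(2000, [1.0, -1.0, 0.5], rng)  # large source cohort
X_tgt, y_tgt = make_cohort(20, [0.8, -1.2, 0.5], rng)    # scarce target cohort

# Step 1: pre-train on the source cohort.
w_pretrained = train_logreg(X_src, y_src)
# Step 2: fine-tune on the target cohort, starting from the pre-trained weights.
w_itl = train_logreg(X_tgt, y_tgt, w_init=w_pretrained, epochs=20)
# Baseline without transfer: same target data, trained from scratch.
w_baseline = train_logreg(X_tgt, y_tgt, epochs=20)
```

The only difference between the transfer model and the baseline in this sketch is the starting point of optimization, which is why the benefit of transfer is expected to matter most when the target cohort is small.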
METHODS
Two patient cohorts were used in this study: 1) eCritical, a multicenter ICU dataset comprising 55,689 unique admission records from 48,672 unique patients admitted to 15 medical-surgical ICUs in Alberta, Canada, between March 2013 and December 2019; and 2) MIMIC-III, a single-center, publicly available ICU dataset from Boston, USA, acquired between 2001 and 2012. We compared DA and ITL models with baseline models (without TL), namely fully connected neural networks, logistic regression, and lasso regression, in the prediction of 30-day mortality, acute kidney injury (AKI), ICU length of stay (ICU_LOS), and hospital length of stay (H_LOS). To compare DA and ITL with the baseline models at various levels of data scarcity, models were trained on random subsets ranging from 1% to 75% of the training data, as well as on the full training set.
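The simulated data-scarcity setup can be sketched as random subsampling of training rows at a series of fractions. This is a minimal illustration under stated assumptions: only the 1%, 75%, and full-data levels come from the text, while the intermediate fractions, the training-set size, the seed, and the function name `sample_training_subsets` are hypothetical, and each subset is drawn independently (the study may have sampled differently).

```python
import random


def sample_training_subsets(n_train, fractions, seed=0):
    """Return {fraction: sorted row indices}, each drawn without replacement."""
    rng = random.Random(seed)
    subsets = {}
    for frac in fractions:
        if not 0.0 < frac <= 1.0:
            raise ValueError(f"fraction must be in (0, 1], got {frac}")
        # Keep at least one row even for very small training sets.
        k = max(1, round(n_train * frac))
        subsets[frac] = sorted(rng.sample(range(n_train), k))
    return subsets


# Hypothetical grid: only 0.01, 0.75, and 1.00 are stated in the text.
FRACTIONS = [0.01, 0.05, 0.10, 0.25, 0.50, 0.75, 1.00]

subsets = sample_training_subsets(n_train=10_000, fractions=FRACTIONS)
for frac, idx in subsets.items():
    print(f"{frac:.0%} of training data: {len(idx)} rows")
```

Each index list would then be used to fit both a TL model and a baseline model on exactly the same rows, so that any performance gap at a given fraction reflects the transfer step rather than a difference in sampled data.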
RESULTS
Overall, the ITL models outperformed the baseline models in 55 out of 56 comparisons, and the DA models outperformed the baseline models in 45 out of 56 comparisons. ITL outperformed the baseline models both more often and by a larger margin than DA. When trained using the 1% data subset, TL models outperformed baseline models in 11 out of 16 cases (8 out of 8 for ITL and 3 out of 8 for DA).
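For readers who prefer proportions, the reported counts can be converted to win rates as follows. The snippet uses only the counts stated above; the dictionary labels and variable names are illustrative.

```python
from fractions import Fraction

# Counts taken directly from the reported results.
reported = {
    "ITL vs. baseline (all comparisons)": Fraction(55, 56),
    "DA vs. baseline (all comparisons)": Fraction(45, 56),
    # At the 1% subset: 8 of 8 ITL cases plus 3 of 8 DA cases.
    "TL vs. baseline at 1% (ITL + DA)": Fraction(8 + 3, 8 + 8),
}

win_rates = {label: float(frac) for label, frac in reported.items()}
for label, rate in win_rates.items():
    print(f"{label}: {rate:.3f}")
```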
CONCLUSIONS
TL-based ICU patient outcome prediction models are useful in data-scarce scenarios. The results of the present study can be used to estimate ICU outcome prediction performance at different levels of data scarcity, with and without TL. The publicly available pre-trained models from this study can serve as building blocks in further research for the development and validation of models in other ICU cohorts and outcomes.