BACKGROUND
High Flow Nasal Cannula (HFNC) provides non-invasive respiratory support for critically ill children, who may tolerate it more readily than other non-invasive ventilation (NIV) techniques such as Bilevel Positive Airway Pressure (BiPAP) and Continuous Positive Airway Pressure (CPAP). Moreover, HFNC may preclude the need for mechanical ventilation (intubation). Nevertheless, NIV or intubation may ultimately be necessary for certain patients. Timely prediction of HFNC failure can provide an indication for increasing respiratory support.
OBJECTIVE
This work developed and compared machine learning models to predict HFNC failure.
METHODS
A retrospective study was conducted using the Virtual Pediatric Intensive Care Unit database of Electronic Medical Records (EMR) of patients admitted to a tertiary pediatric ICU from January 2010 to February 2020. Patients younger than 19 years, without apnea, and receiving HFNC treatment were included. A Long Short-Term Memory (LSTM) model using 517 variables (vital signs, laboratory data and other clinical parameters) was trained to generate a continuous prediction of HFNC failure, defined as escalation to NIV or intubation within 24 hours of HFNC initiation. For comparison, seven other models were trained: a Logistic Regression (LR) using the same 517 variables, another LR using only 14 variables, and five additional LSTM-based models using the same 517 variables as the first LSTM and incorporating additional ML techniques (transfer learning, input perseveration, and ensembling). Performance was assessed using the area under the receiver operating characteristic curve (AUROC) at various times following HFNC initiation. The sensitivity, specificity, and positive and negative predictive values (PPV, NPV) of predictions at two hours after HFNC initiation were also evaluated. These metrics were also computed in a cohort with primarily respiratory diagnoses.
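As an illustration of the evaluation metrics named above, the following minimal sketch computes AUROC, sensitivity, specificity, PPV and NPV from a set of predicted failure probabilities. It is not the study's code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for this example, and the data are invented rather than drawn from the study cohort.

```python
from typing import Dict, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def threshold_metrics(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV when a score at or above
    `threshold` is treated as a predicted HFNC failure."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_failure = s >= threshold
        if predicted_failure and y == 1:
            tp += 1
        elif predicted_failure and y == 0:
            fp += 1
        elif not predicted_failure and y == 1:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Invented toy data: 1 = escalated to NIV or intubation, 0 = did not.
toy_labels = [1, 1, 0, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.7, 0.2, 0.4, 0.1, 0.3, 0.5]

toy_auroc = auroc(toy_labels, toy_scores)
toy_metrics = threshold_metrics(toy_labels, toy_scores, threshold=0.5)
```

AUROC summarizes ranking quality across all thresholds, whereas the other four metrics depend on the chosen cut-off. PPV and NPV also vary with the failure rate in the evaluated cohort.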
RESULTS
A total of 834 HFNC trials [455 training, 173 validation, 206 test] met the inclusion criteria, of which 175 [103, 30, 42] (21.0%) escalated to NIV or intubation. The LSTM models trained with transfer learning generally performed better than the LR models. Two hours after HFNC initiation, the best LSTM model achieved an AUROC of 0.78, compared with 0.66 for the 14-variable LR and 0.71 for the 517-variable LR. All models except the 14-variable LR achieved higher AUROCs in the respiratory cohort than in the general ICU population.
CONCLUSIONS
Machine learning models trained using EMR data were able to identify children at risk of failing HFNC within 24 hours of initiation. LSTM models that incorporated transfer learning, input data perseveration and ensembling performed better than the LR and standard LSTM models.