BACKGROUND
Postoperative venous thromboembolic events (VTE), encompassing deep vein thrombosis (DVT) and pulmonary embolism (PE), are preventable but can cause severe morbidity and mortality in post-surgical patients. VTE affects around 1 in 1000 individuals annually, and untreated VTE carries a 30% mortality rate, making prompt diagnosis and therapy vital. These events result in $1 billion in hospital costs every year. While VTE prophylaxis is crucial, some patients may face challenges that limit their ability to receive it, such as bleeding risk, allergies or adverse reactions, recent surgery or trauma, severe liver or kidney disease, and drug interactions. Of the patients who develop VTE after surgery, around 40% do so during the initial hospital stay, whereas 60% develop VTE within 30-90 days after discharge. Existing VTE risk scoring systems, such as Caprini and Rogers, are suboptimal and can be cumbersome and difficult to translate into patient-specific clinical decision-making. Though machine learning approaches have been used to predict VTE risk in specific surgery types, no validated predictive models are currently used in clinical practice for VTE risk calculation across the broader population of post-operative patients.
OBJECTIVE
The study seeks to use both static and time-varying electronic health record (EHR) patient data (aggregated per visit) to:
1) predict the VTE risk for a surgical patient within 30 days post-discharge using machine learning (ML); 2) stratify the VTE risk of the patient into high, medium, and low-risk categories; 3) compare the performance of individual department-specific ML models to that of a unified model; and 4) determine which variables are most strongly associated with postoperative VTE.
METHODS
Structured EHR data from post-operative patients treated in our multi-center hospital system between 2013 and 2019 were used. Various machine learning algorithms, including linear regression with L1/L2 regularization, random forest, and eXtreme Gradient Boosting (XGBoost), were evaluated. Different output probability thresholds were chosen to stratify the patients into low, medium, and high-risk categories. Department-specific ML models were developed, and their performance was compared with that of the unified model to determine the viability of constructing individual models versus a single unified model. Feature importance techniques were subsequently applied to identify the most influential features, followed by Rank Biased Overlap (RBO) analysis to quantify the level of overlap among the top-ranked features of the department-specific models.
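As an illustration of the RBO step, the following is a minimal sketch of the extrapolated form of Rank Biased Overlap for two equally weighted, duplicate-free rankings compared down to a common depth. The function name, the persistence parameter value p = 0.9, and the example rankings are assumptions made for illustration only; they are not taken from the study, which does not report its RBO settings.

```python
def rank_biased_overlap(ranking_a, ranking_b, p=0.9):
    """Extrapolated RBO for two duplicate-free ranked lists.

    Both lists are compared down to the length of the shorter one.
    p (0 < p < 1) controls how strongly the top ranks are weighted;
    the default of 0.9 is an illustrative assumption.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    depth = min(len(ranking_a), len(ranking_b))
    if depth == 0:
        return 0.0

    seen_a, seen_b = set(), set()
    overlap = 0          # size of the intersection of the two prefixes
    weighted_sum = 0.0   # running sum of agreement_d * p**d
    agreement = 0.0
    for d in range(1, depth + 1):
        item_a, item_b = ranking_a[d - 1], ranking_b[d - 1]
        if item_a == item_b:
            overlap += 1
        else:
            # A new item counts once it has appeared in the other prefix.
            if item_a in seen_b:
                overlap += 1
            if item_b in seen_a:
                overlap += 1
        seen_a.add(item_a)
        seen_b.add(item_b)
        agreement = overlap / d
        weighted_sum += agreement * p ** d

    # Extrapolate by assuming the agreement at the final depth continues.
    return agreement * p ** depth + (1 - p) / p * weighted_sum


# Hypothetical feature rankings for two departments (not study results).
dept_one = ["LOS", "BUN", "WBC", "heart_rate", "age"]
dept_two = ["LOS", "WBC", "BUN", "age", "BMI"]
similarity = rank_biased_overlap(dept_one, dept_two)
```

A value near 1 indicates that two departments rank nearly the same features at the top, while a value near 0 indicates little shared ranking.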
RESULTS
Our findings demonstrate the effectiveness of ML models in predicting and categorizing VTE risk among post-operative patients. The top-performing ML model achieved an F1 score of 0.76 with an area under the receiver operating characteristic curve (AUROC) of 0.83 and an area under the precision-recall curve (AUPRC) of 0.89. Patients with an ML output probability below 0.3 were designated low risk, while probabilities exceeding 0.7 indicated high risk. Notably, department-specific models failed to surpass the performance of the unified model. Analysis of feature importance highlighted the length of stay (LOS) as the most influential predictor, followed by laboratory results such as blood urea nitrogen (BUN) and white blood cell count (WBC), vital signs such as heart rate and temperature, and the patient’s age and body mass index (BMI). Our model integrates the dynamic aspects of a patient’s condition, incorporating changes in laboratory values and vital signs into its assessments. This represents a critical improvement over static models, which fail to account for the fluctuating clinical features of patients.
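The reported stratification rule can be sketched as a simple mapping from model output probability to risk category. The function name is hypothetical, and treating probabilities of exactly 0.3 or 0.7 as medium risk is an assumption, since the text specifies only "less than 0.3" and "exceeding 0.7".

```python
def stratify_vte_risk(probability, low_cutoff=0.3, high_cutoff=0.7):
    """Map a predicted VTE probability to a risk category.

    Cutoffs follow the values reported in the text; assigning the
    boundary values themselves to "medium" is an assumption.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < low_cutoff:
        return "low"
    if probability > high_cutoff:
        return "high"
    return "medium"


# Hypothetical model outputs for three patients (not study data).
categories = [stratify_vte_risk(prob) for prob in (0.12, 0.55, 0.91)]
```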
CONCLUSIONS
We have demonstrated the utility of machine learning models in predicting and stratifying VTE risk among post-operative patients. Accurate risk assessment plays a pivotal role in VTE prevention, enabling the development of personalized VTE prophylaxis strategies and monitoring plans. Our research marks the initial phase in creating a decision support tool to provide automated risk assessments, guiding tailored screening and prophylaxis approaches for individual patients.