Abstract
Introduction
The objective of tertiary prevention is to reduce re-hospitalization, which puts patients at unnecessary risk, delays care for others in the population who need it promptly, and imposes financial burdens on healthcare systems. Nevertheless, it is challenging to stratify tertiary prevention needs across a population. Hence, we advance an analytic protocol to identify, from an inpatient population, the clinical and service-utilization profiles of those re-hospitalized within 28 days.
Methods and analysis
The protocol implements unsupervised and supervised machine learning (ML) in tandem on an inpatient population's electronic health records. The unsupervised ML will cluster the population into segments that maximize within-segment similarity and between-segment dissimilarity across the dimensions of clinical diagnoses, acuity, complexity, chronicity, and multimorbidity. Within each clinically similar segment identified, a decision tree supervised by the 28-day re-hospitalization outcome will classify the segment into a series of binary-split subgroups using profile and service utilization-related features. The order in which features are selected reflects their relative importance to the outcome. The two subgroups arising from a selected feature differ statistically in re-hospitalization outcomes. Therefore, the subgroups that lack the selected services and show the highest re-hospitalization rates will potentially benefit most from tertiary prevention delivered through those services, and are thus suited to further assessment and corresponding interventions.
Ethics and dissemination
The Survey and Behaviour Research Ethics Committee of the Chinese University of Hong Kong, Hospital Authority Data Collaboration Lab, and a local ethics committee have approved this protocol and its ongoing and forthcoming validations, territory-wide in Hong Kong through centralized data access and in local clinical management systems, respectively.
In addition to dissemination through publications, presentations, and other communications, the protocol can be implemented in different systems as part of a decision support mechanism to inform the venue-based sampling of patients at high risk of re-hospitalization.
Article Summary
Strengths and Limitations of this study
Strength #1
This will be a data-based machine-learning analytic methodology that aligns population-based cohort study with the conceptual framework of the Swiss Cheese Model for patient safety. The Swiss Cheese Model moves us from the paradigm of linear causality to conceptualizing safety events as the simultaneous breach of multiple lines of defense. However, the predominant linear-model research is ill-suited to identifying a medical system's multiple lines of defense and their potential breakpoints, because it gives far less attention to prioritizing impacts and mapping interactions on outcomes.
Strength #2
This protocol no longer assumes independence among clinical profiles and service utilization-related factors. As a result, both the absence and the ensembles of post-acute services across the entire population can be handled appropriately.
Strength #3
Subgroups with statistically significant differences in re-hospitalization outcomes will be identified and can be articulated as a portfolio of clinical profiles and selected services.
Limitations
Heterogeneity may still exist after the clustering and classification based on clinical homogeneity. As in all studies based on clinical databases, this heterogeneity may be attributable to exogenous factors absent from clinical information systems, e.g., psychosocial factors, and to the narrowly defined tertiary prevention needs captured by the supervisory outcome.
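As an illustration of the supervised step described under Methods and analysis, the following minimal Python sketch selects, within each segment, the binary service feature whose split most reduces Gini impurity in the 28-day re-hospitalization outcome. It is a sketch under stated assumptions, not the protocol's implementation: the field names (`segment`, `followup_clinic`, `home_nursing`, `readmit_28d`) and the toy records are hypothetical, the segments are assumed to have been assigned already by the unsupervised step, Gini impurity is an assumed split criterion (the protocol does not name one here), and only the first split of a tree is shown, without any statistical test.

```python
from collections import defaultdict


def gini(outcomes):
    """Gini impurity of a list of 0/1 outcomes."""
    if not outcomes:
        return 0.0
    p = sum(outcomes) / len(outcomes)
    return 2 * p * (1 - p)


def best_binary_split(records, features, outcome="readmit_28d"):
    """Return (feature, impurity_reduction, rate_without, rate_with)
    for the binary feature that best separates the outcome, or None
    if no feature splits the records into two non-empty subgroups."""
    parent_impurity = gini([r[outcome] for r in records])
    best = None
    for f in features:
        without = [r[outcome] for r in records if r[f] == 0]
        with_ = [r[outcome] for r in records if r[f] == 1]
        if not without or not with_:
            continue  # feature does not split this segment
        weighted = (len(without) * gini(without)
                    + len(with_) * gini(with_)) / len(records)
        gain = parent_impurity - weighted
        if best is None or gain > best[1]:
            best = (f, gain,
                    sum(without) / len(without),
                    sum(with_) / len(with_))
    return best


def split_by_segment(records, features, segment_key="segment"):
    """Apply the first-level split separately within each segment."""
    by_segment = defaultdict(list)
    for r in records:
        by_segment[r[segment_key]].append(r)
    return {seg: best_binary_split(rows, features)
            for seg, rows in by_segment.items()}


# Hypothetical toy data: (segment, followup_clinic, home_nursing, readmit_28d)
FEATURES = ["followup_clinic", "home_nursing"]
rows = [
    ("A", 0, 0, 1), ("A", 0, 1, 1), ("A", 0, 0, 1), ("A", 0, 1, 0),
    ("A", 1, 0, 0), ("A", 1, 1, 0), ("A", 1, 0, 0), ("A", 1, 1, 0),
    ("B", 0, 0, 1), ("B", 1, 0, 1), ("B", 0, 1, 0), ("B", 1, 1, 0),
]
records = [
    {"segment": s, "followup_clinic": fc, "home_nursing": hn, "readmit_28d": y}
    for s, fc, hn, y in rows
]

result = split_by_segment(records, FEATURES)
for seg, (feature, gain, rate_without, rate_with) in sorted(result.items()):
    print(seg, feature, round(gain, 4), rate_without, rate_with)
```

In this toy data, the subgroup of each segment that lacks the selected service has the higher re-hospitalization rate, which is the kind of subgroup the protocol flags for further assessment. A full analysis would grow the tree recursively and verify that subgroup differences are statistically significant.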
Publisher
Cold Spring Harbor Laboratory