Assessing the Impact of Person-level Matching Algorithms to Identify Risk of Fatal Opioid Overdose Across Disparate Datasets: Retrospective Analysis (Preprint)

Published: 2020-02-02

Author:

Ferris Lindsey, Weiner Jonathan P., Saloner Brendan, Kharrazi Hadi

Abstract

BACKGROUND

The opioid epidemic in the United States has precipitated a need for public health agencies to better understand risk factors associated with fatal overdoses. Matching person-level information stored in public health, medical, and human services datasets can enhance the understanding of opioid overdose risk factors and interventions. A major impediment to using datasets from separate agencies has been the lack of a cross-organization unique identifier. Although different matching techniques that leverage patient demographic information can be used, the impact of using a particular matching approach is not well understood.

OBJECTIVE

This study compares the impact of using probabilistic versus deterministic matching algorithms to link disparate datasets for identifying persons at risk of a fatal overdose.

METHODS

This study used statewide prescription drug monitoring program (PDMP), arrest, and mortality data matched at the person level using one probabilistic and two deterministic matching algorithms. The impact of matching was assessed by comparing, across the combined datasets, the prevalence of key risk indicators, the prevalence of the outcome, and the performance of a multivariate logistic regression model for fatal overdose.
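
The difference between the two families of matching algorithms can be illustrated with a minimal sketch. The abstract does not specify the fields, rules, or software the study used, so everything below is an assumption made for illustration: the field names, the Fellegi-Sunter-style agreement weights (the `M_U` values), and the `threshold` are hypothetical and are not taken from the study.

```python
import math

# Hypothetical per-field probabilities (NOT from the study):
# m = P(field agrees | same person), u = P(field agrees | different people).
M_U = {
    "first": (0.95, 0.01),
    "last": (0.95, 0.005),
    "dob": (0.98, 0.001),
}


def _agree(a, b, field):
    """Case-insensitive exact comparison of one field."""
    return a[field].strip().lower() == b[field].strip().lower()


def deterministic_match(a, b, fields=("first", "last", "dob")):
    """Deterministic rule: records match only if every listed field agrees."""
    return all(_agree(a, b, f) for f in fields)


def probabilistic_score(a, b):
    """Fellegi-Sunter-style score: sum of log2 agreement/disagreement weights."""
    score = 0.0
    for field, (m, u) in M_U.items():
        if _agree(a, b, field):
            score += math.log2(m / u)              # evidence for a match
        else:
            score += math.log2((1 - m) / (1 - u))  # evidence against a match
    return score


def probabilistic_match(a, b, threshold=10.0):
    """Declare a match when the total weight reaches an assumed threshold."""
    return probabilistic_score(a, b) >= threshold


# Two toy records for the same hypothetical person; the first name is misspelled
# in one source and the zip code differs.
pdmp_rec = {"first": "Jonathan", "last": "Smith", "dob": "1980-04-12", "zip": "21201"}
death_rec = {"first": "Jonathon", "last": "Smith", "dob": "1980-04-12", "zip": "21205"}
other_rec = {"first": "Maria", "last": "Lopez", "dob": "1975-09-30", "zip": "21201"}

basic = deterministic_match(pdmp_rec, death_rec)
plus_zip = deterministic_match(pdmp_rec, death_rec, ("first", "last", "dob", "zip"))
prob = probabilistic_match(pdmp_rec, death_rec)
```

In this toy case both deterministic rules reject the pair because of the single spelling difference, while the weighted score still exceeds the threshold. That is one plausible mechanism by which a probabilistic algorithm could link more records, at the cost of having to estimate and maintain the weights and threshold.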

RESULTS

The probabilistically matched population had the highest degree of matching within the PDMP data and with arrest and mortality data, resulting in the highest prevalence of high-risk indicators and the outcome. Model performance, measured by the area under the curve (AUC), was comparable across the algorithms (probabilistic: 0.847; deterministic-basic: 0.854; deterministic+zip: 0.826), but the models demonstrated tradeoffs between sensitivity and specificity.
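
To make the distinction between AUC and the sensitivity/specificity tradeoff concrete, the following is a small sketch on made-up toy values (the labels and scores are illustrative only and are not study data). The function names are hypothetical. AUC summarizes ranking quality over all thresholds, whereas sensitivity and specificity are reported at one chosen threshold.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are flagged high risk."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: 1 = fatal overdose, 0 = no fatal overdose; scores are model outputs.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]

toy_auc = auc(toy_labels, toy_scores)
strict = sensitivity_specificity(toy_labels, toy_scores, threshold=0.5)
lenient = sensitivity_specificity(toy_labels, toy_scores, threshold=0.25)
```

With a single AUC for the toy model, lowering the threshold from 0.5 to 0.25 raises sensitivity and lowers specificity. Two models with similar AUCs can therefore still differ in which of the two errors they favor at a given operating point.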

CONCLUSIONS

The probabilistic algorithm was more successful in linking patients in the PDMP data to death and arrest data, resulting in a larger at-risk population. However, deterministic-basic matching may be a suitable option for understanding high-level risk, based on the model’s AUC (0.854). The clinical use case should be considered when selecting a matching approach, as probabilistic algorithms can be more resource-intensive and costly to maintain than deterministic algorithms.

Publisher

JMIR Publications Inc.
