Abstract
Traditional machine learning (ML) approaches learn to recognize patterns in data but do not go beyond observed associations. Such data-driven methods can generalize poorly when the data falls outside the independent and identically distributed (i.i.d.) setting. Causal inference can help data-driven techniques move beyond spurious associations by framing the data-generating process through a causal lens. Combining domain expertise with traditional ML techniques makes it possible to answer causal questions about the data, including hypothetical (counterfactual) questions about alternate realities. In this paper, we estimate the causal effect of Pre-Exposure Prophylaxis (PrEP) on mortality in COVID-19 patients from an observational dataset of over 120,000 patients. With the help of medical experts, we hypothesize a causal graph that distinguishes causal from non-causal associations and identifies the potential confounding variables. We estimate the causal effect using linear regression, matching, and machine learning (meta-learners). On average, our estimates show that taking PrEP can result in a 2.1% decrease in the death rate, or a total of around 2,540 patients' lives saved in the studied population.
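To illustrate why adjusting for confounders matters, the following is a minimal sketch on synthetic data, not the paper's dataset or its estimators. It assumes a single binary confounder `z` (for example, a high-risk flag) that lowers the chance of treatment `t` and raises the chance of death `y`, and it uses plain stratification as a simpler stand-in for the regression, matching, and meta-learner estimators named above. All function names and probabilities are hypothetical choices made for the illustration.

```python
import random


def simulate(n, seed=0):
    """Generate synthetic (z, t, y) rows; every probability here is made up."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0        # confounder (assumed binary)
        p_treat = 0.2 if z else 0.6               # z makes treatment less likely
        t = 1 if rng.random() < p_treat else 0
        p_death = 0.10 + 0.10 * z - 0.02 * t      # simulated true effect: -0.02
        y = 1 if rng.random() < p_death else 0
        rows.append((z, t, y))
    return rows


def mean_outcome(rows, t, z=None):
    """Mean outcome among rows with treatment t, optionally within stratum z."""
    ys = [y for (zi, ti, y) in rows if ti == t and (z is None or zi == z)]
    return sum(ys) / len(ys)


def naive_difference(rows):
    """Unadjusted difference in death rates: an association, not a causal effect."""
    return mean_outcome(rows, 1) - mean_outcome(rows, 0)


def adjusted_ate(rows):
    """Stratify on z, then average the per-stratum differences weighted by P(z)."""
    n = len(rows)
    ate = 0.0
    for z in (0, 1):
        weight = sum(1 for (zi, _, _) in rows if zi == z) / n
        ate += weight * (mean_outcome(rows, 1, z) - mean_outcome(rows, 0, z))
    return ate


rows = simulate(200_000)
print(f"naive difference: {naive_difference(rows):+.4f}")
print(f"adjusted ATE:     {adjusted_ate(rows):+.4f}  (simulated truth: -0.0200)")
```

In this simulation the naive difference overstates the benefit of treatment because treated patients are disproportionately low-risk, while the stratified estimate lands close to the simulated effect. With many confounders, stratification becomes impractical, which is one reason to turn to regression, matching, or meta-learners.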
Publisher
Cold Spring Harbor Laboratory