Author:
Belbin Gillian M, Wenric Stephane, Cullina Sinead, Glicksberg Benjamin S, Moscati Arden, Wojcik Genevieve L, Shemirani Ruhollah, Beckmann Noam D, Cohain Ariella, Sorokin Elena P, Park Danny S, Ambite Jose-Luis, Ellis Steve, Auton Adam, Bottinger Erwin P, Cho Judy H, Loos Ruth JF, Abul-Husn Noura S, Zaitlen Noah A, Gignoux Christopher R, Kenny Eimear E
Abstract
Understanding population health disparities is an essential component of equitable precision health efforts. Epidemiological research often relies on definitions of race and ethnicity, but these population labels may not adequately capture disease burdens specific to sub-populations. Here we propose a framework for repurposing data from Electronic Health Records (EHRs), in concert with genomic data, to explore the enrichment of disease within sub-populations. Using data from a diverse biobank in New York City, we genetically identified 17 sub-populations and noted the presence of genetic founder effects in 7 of them. By then linking community membership to the EHR, we identified over 600 health outcomes that were statistically enriched within a specific population; many represent known associations, and many others are novel. This work reinforces the utility of linking genomic data to EHRs and provides a framework for fine-scale monitoring of population health.
Publisher
Cold Spring Harbor Laboratory