Abstract
Objective: The National Institutes of Health’s All of Us Research Program addresses gaps in biomedical research by collecting health data from diverse populations. Pregnant individuals have historically been underrepresented in biomedical research, and pregnancy-related research is often limited by data availability, sample size, and inadequate representation of the diversity of pregnant people. We aimed to identify pregnancy episodes with high-quality electronic health record (EHR) data in All of Us Research Program data and evaluate the program’s utility for pregnancy-related research.

Materials and Methods: We used an algorithm to identify pregnancy episodes in All of Us EHR data. We described these pregnancies, validated them with additional data, and compared them to national statistics.

Results: Our study identified 18,970 pregnancy episodes from 14,234 participants; other possible pregnancy episodes had low-quality or insufficient data. Validation against people who reported a current pregnancy on an All of Us survey found low false positive and negative rates. Demographics were similar in some respects to national data; however, Asian-Americans were underrepresented, and older, highly educated pregnant people were overrepresented.

Discussion: Our approach demonstrates the capacity of All of Us to support pregnancy research and reveals the diversity of the pregnancy cohort. However, we noted an underrepresentation among some demographics. Other limitations include measurement error in gestational age and limited data on non-live births.

Conclusion: The wide variety of data in the All of Us program, encompassing EHR, survey, genomic, and Fitbit data, offers a valuable resource for studying pregnancy, yet care must be taken to avoid biases.
Publisher
Cold Spring Harbor Laboratory