Author:
Adhikari Kamala, Patten Scott B, Patel Alka B, Premji Shahirose, Tough Suzanne, Letourneau Nicole, Giesbrecht Gerald, Metcalfe Amy
Abstract
Pooling data from multiple pre-existing datasets can increase a study's sample size and statistical power to answer a research question. However, individual datasets may contain variables that measure the same construct differently, which makes pooling difficult. Variable harmonization, an approach that generates comparable datasets from heterogeneous sources, can address this issue in some circumstances. As an illustrative example, this paper describes the data harmonization strategies that helped generate comparable datasets across two Canadian pregnancy cohort studies: the All Our Families study and the Alberta Pregnancy Outcomes and Nutrition study.
Variables were harmonized by considering multiple features across the datasets: the construct measured; the question asked and its response options; the measurement scale used; the frequency of measurement; the timing of measurement; and the data structure. Based on these features, variables were classified as completely matching, partially matching, or completely unmatching across the datasets. Variables that were an exact match were pooled as is. Partially matching variables were synchronized across the datasets with respect to the frequency of measurement, the timing of measurement, and the response options. Variables that were completely unmatching could not be harmonized into a single variable.
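The matching and synchronization steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the `VariableSpec` fields, the rule that a differing construct means "unmatching", the smoking variable, and both recode maps are hypothetical and were invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableSpec:
    """Features compared when matching one variable across two datasets."""
    construct: str           # what the variable is meant to measure
    response_options: tuple  # allowed answers, in questionnaire order
    scale: str               # measurement scale, e.g. "nominal"
    n_measurements: int      # frequency of measurement
    timing: str              # when the measurement was taken


def classify_match(a: VariableSpec, b: VariableSpec) -> str:
    """Return 'complete', 'partial', or 'unmatching' for a pair of variables.

    Assumption: a different construct means the pair cannot be harmonized;
    the same construct with any differing feature is a partial match.
    """
    if a.construct != b.construct:
        return "unmatching"
    if a == b:
        return "complete"
    return "partial"


def recode(values, mapping):
    """Map raw responses onto a common scale; unmapped values become None."""
    return [mapping.get(v) for v in values]


# Hypothetical example: the same construct asked with different response options.
smoking_a = VariableSpec("smoking", ("never", "former", "current"),
                         "nominal", 1, "second trimester")
smoking_b = VariableSpec("smoking", ("no", "yes"),
                         "nominal", 1, "second trimester")

# Hypothetical recode maps onto the coarser, shared yes/no scale.
TO_COMMON_A = {"never": "no", "former": "no", "current": "yes"}
TO_COMMON_B = {"no": "no", "yes": "yes"}


def pool(values_a, values_b):
    """Recode both cohorts to the common scale and stack them with a source tag."""
    pooled = [("A", v) for v in recode(values_a, TO_COMMON_A)]
    pooled += [("B", v) for v in recode(values_b, TO_COMMON_B)]
    return pooled
```

Recoding to the coarser of the two response sets loses detail from the richer dataset, which is the usual price of synchronizing partially matching variables. Keeping a source tag on each pooled record allows later analyses to adjust for cohort.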
The variable harmonization strategies that were used to generate comparable cohort datasets for data pooling are applicable to other data sources. Future studies may employ or evaluate these strategies. Variable harmonization and pooling provide an opportunity to increase study power and the utility of existing data, permitting researchers to answer novel research questions in a statistically efficient, timely, and cost-efficient manner that could not be achieved using a single data source.
Subject
Information Systems and Management, Health Informatics, Information Systems, Demography
Cited by
22 articles.