Affiliation:
1. Chinese Academy of Sciences, Academy of Mathematics and Systems Science, Zhongguancun East Road, Beijing 100190, China
2. Peking University, Department of Probability and Statistics, 5 Summer Palace Road, Beijing 100871, China
Abstract
Information from multiple data sources is increasingly available. However, some data sources may produce biased estimates due to biased sampling, data corruption or model misspecification. Thus there is a need for robust data combination methods that can be used with biased sources. In this paper, a robust data fusion-extraction method is proposed. Unlike existing methods, the proposed method can be applied in the important case where researchers have no knowledge of which data sources are unbiased. The proposed estimator is easy to compute and employs only summary statistics; hence it can be applied in many different fields, such as meta-analysis, Mendelian randomization and distributed systems. The proposed estimator is consistent, even if many data sources are biased, and is asymptotically equivalent to the oracle estimator that uses only unbiased data. Asymptotic normality of the proposed estimator is also established. In contrast to existing meta-analysis methods, the theoretical properties are guaranteed for our estimator, even if the number of data sources and the dimension of the parameter diverge as the sample size increases. Furthermore, the proposed method provides consistent selection for unbiased data sources with probability approaching 1. Simulation studies demonstrate the efficiency and robustness of the proposed method empirically. The method is applied to a meta-analysis dataset to evaluate surgical treatment for moderate periodontal disease and to a Mendelian randomization dataset to study the risk factors for head and neck cancer.
Publisher
Oxford University Press (OUP)
Subject
Applied Mathematics; Statistics, Probability and Uncertainty; General Agricultural and Biological Sciences; Agricultural and Biological Sciences (miscellaneous); General Mathematics; Statistics and Probability