Author:
Baker Brennan H., Sathyanarayana Sheela, Szpiro Adam A., MacDonald James, Paquette Alison G.
Abstract
Missing covariate data is a common problem that has not been addressed in observational studies of gene expression. Here we present a multiple imputation (MI) method that accommodates high-dimensional transcriptomic data by binning genes, creating separate MI datasets and differential expression models within each bin, and pooling results with Rubin’s rules. Simulation studies using real and synthetic data show that this method outperforms complete case and single imputation analyses at uncovering true positive differentially expressed genes, limiting false discovery rates, and minimizing bias. This method is easily implemented via an R package, “RNAseqCovarImpute”, that integrates with the limma-voom pipeline.
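The binning and pooling steps named in the abstract can be illustrated with a minimal Python sketch. This is not the RNAseqCovarImpute API: the function names `bin_genes` and `pool_rubin` are hypothetical, and splitting genes into contiguous fixed-size bins is an assumption made only for illustration. The pooling function applies the standard form of Rubin's rules to one coefficient estimated across m imputed datasets.

```python
from statistics import mean, variance


def bin_genes(gene_ids, bin_size):
    """Split a list of gene IDs into consecutive bins (hypothetical helper).

    Assumption: contiguous fixed-size bins; the last bin may be smaller.
    """
    if bin_size < 1:
        raise ValueError("bin_size must be at least 1")
    return [gene_ids[i:i + bin_size] for i in range(0, len(gene_ids), bin_size)]


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates: per-imputation point estimates
    variances: per-imputation squared standard errors
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (n - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


if __name__ == "__main__":
    # Illustrative values only, not results from the paper.
    print(bin_genes(["g1", "g2", "g3", "g4", "g5"], 2))
    print(pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))
```

In this sketch, each bin would be imputed and modeled separately, and `pool_rubin` would then be applied per gene and per coefficient. The total variance grows with disagreement between imputations, which is how the pooled standard error reflects uncertainty from the missing covariates.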
Publisher
Cold Spring Harbor Laboratory