Affiliation:
1. Department of Statistics, University of Illinois at Urbana-Champaign, 605 E. Springfield Avenue, Champaign, Illinois 61820, U.S.A.
Abstract
Summary
Differential abundance tests for compositional data are fundamental to many biomedical applications, such as single-cell, bulk RNA-seq and microbiome data analysis. However, because of the compositional constraint and the prevalence of zero counts in the data, differential abundance analysis of compositional data remains a difficult and unresolved statistical problem. This article proposes a new differential abundance test, the robust differential abundance test, to address these challenges. Compared with existing methods, the robust differential abundance test is simple and computationally efficient, is robust to prevalent zero counts in compositional datasets, takes the data’s compositional nature into account, and has a theoretical guarantee of controlling false discoveries in a general setting. Furthermore, in the presence of observed covariates, the robust differential abundance test can be combined with covariate-balancing techniques to remove potential confounding effects and draw reliable conclusions. The proposed test is applied to several numerical examples, and its merits are demonstrated using both simulated and real datasets.
Funder
National Science Foundation
Publisher
Oxford University Press (OUP)
Subject
Applied Mathematics; Statistics, Probability and Uncertainty; General Agricultural and Biological Sciences; Agricultural and Biological Sciences (miscellaneous); General Mathematics; Statistics and Probability
Cited by
3 articles.