Abstract
A major task of microbiome epidemiology is association analysis, where the goal is to identify microbial features related to host health. This is commonly performed by differential abundance (DA) analysis, which, by design, examines each microbe in isolation from the rest of the microbiome. DA therefore does not properly account for the microbiome’s compositional nature or microbe-microbe ecological interactions, and can lead to confounded findings, i.e., microbes that only appear to associate with health through their confounding association with health-related, biologically informative microbes. To remedy these issues, we present Compositional Differential Abundance (CompDA) analysis, a novel approach for health-microbiome association. CompDA identifies health-related microbes by examining the microbiome holistically, which a) accounts for the data’s compositionality and ecological interactions, and b) has clear interpretations corresponding to host health as affected by microbiome-based interventions. CompDA prioritizes health-related microbes and controls false discoveries by implementing recent advances from high-dimensional statistics, and can be flexibly adapted to many common tasks in modern microbiome epidemiology, including enhancing microbiome-based machine learning by providing rigorous p-values to prioritize important features. We validate the performance of CompDA and compare it against canonical microbiome association methods, including DA, with extensive, real-data-informed simulation studies. Lastly, we report novel and consistent findings of CompDA in application, based on re-examination of recently reported microbial signatures of colorectal cancer in a meta-analysis.
Publisher
Cold Spring Harbor Laboratory