Abstract
High-dimensional multi-omics microbiome data play an important role in elucidating how microbial communities interact with their hosts and environment in critical diseases and ecological changes. Although Bayesian clustering methods have recently been used for the integrated analysis of multi-omics data, none has been designed specifically for multi-omics microbiome data. In this study, we propose a novel framework called integrative stochastic variational variable selection (I-SVVS), an extension of stochastic variational variable selection for high-dimensional microbiome data. The I-SVVS approach fits a specific Bayesian mixture model to each type of omics data, such as an infinite Dirichlet multinomial mixture model for microbiome data and an infinite Gaussian mixture model for metabolomic data. This approach is expected to reduce the computational time of the clustering process and improve the accuracy of the clustering results. The method can also identify a critical set of representative variables in multi-omics microbiome data. Three datasets from soybean, mice, and humans, each integrating microbiome and metabolome data, were used to demonstrate the potential of I-SVVS. The results suggest that I-SVVS achieved better accuracy and significantly faster computation than existing methods on all test datasets and was able to identify the important microbial species and metabolites that characterized each cluster.
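As a rough illustration of pairing one likelihood with each omics type, the following minimal sketch scores a single sample under a Dirichlet multinomial component (microbiome counts) and a diagonal Gaussian component (metabolite values), then combines the two into cluster responsibilities. It is not the I-SVVS algorithm: it assumes a finite number of clusters with fixed, hand-picked parameters, assumes both omics layers share one cluster assignment, and omits the infinite mixture prior, the stochastic variational inference, and the variable selection. All function names, parameter values, and data are hypothetical.

```python
import math


def dm_loglik(counts, alpha):
    """Dirichlet-multinomial log-likelihood of one count vector.

    The multinomial coefficient is omitted because it is the same for
    every cluster and cancels when responsibilities are normalized.
    """
    n = sum(counts)
    a0 = sum(alpha)
    ll = math.lgamma(a0) - math.lgamma(a0 + n)
    for x, a in zip(counts, alpha):
        ll += math.lgamma(a + x) - math.lgamma(a)
    return ll


def gauss_loglik(values, means, variances):
    """Log-likelihood under a Gaussian with diagonal covariance."""
    ll = 0.0
    for v, m, s2 in zip(values, means, variances):
        ll += -0.5 * (math.log(2.0 * math.pi * s2) + (v - m) ** 2 / s2)
    return ll


def responsibilities(counts, values, clusters, weights):
    """Posterior cluster probabilities for one sample.

    Assumption: the two omics layers are conditionally independent
    given a shared cluster label, so their log-likelihoods add.
    """
    log_post = [
        math.log(w)
        + dm_loglik(counts, c["alpha"])
        + gauss_loglik(values, c["mean"], c["var"])
        for w, c in zip(weights, clusters)
    ]
    top = max(log_post)  # subtract the maximum for numerical stability
    unnorm = [math.exp(lp - top) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical two-cluster setup: 3 taxa, 2 metabolites.
clusters = [
    {"alpha": [5.0, 1.0, 1.0], "mean": [0.0, 0.0], "var": [1.0, 1.0]},
    {"alpha": [1.0, 1.0, 5.0], "mean": [3.0, 3.0], "var": [1.0, 1.0]},
]
weights = [0.5, 0.5]

# A sample dominated by the first taxon, with metabolite values near zero.
resp = responsibilities([40, 5, 5], [0.1, -0.2], clusters, weights)
print(resp)
```

In this toy setup the sample is assigned almost entirely to the first cluster, since both its count profile and its metabolite values match that cluster's parameters.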
Publisher
Cold Spring Harbor Laboratory