Abstract
Understanding both global and local patterns in the structure and interplay of microbial communities has been a fundamental question in ecological research. In this paper, we present a Python toolbox that combines two emerging techniques that have been proposed as useful for analyzing compositional microbial data. First, we introduce a visualization module that uses UMAP, a recent dimensionality reduction technique that focuses on local patterns, and HDBSCAN, a density-based clustering technique. Second, we include a module that runs an enhanced version of the SparCC code, which supports larger datasets than before, and we couple it with network theory analyses to describe the resulting co-occurrence networks, including several novel analyses, such as structural balance metrics and a proposal to discover the underlying topology of a co-occurrence network. We validated the proposed toolbox on 1) a simple and well-described biological network of kombucha, consisting of 48 ASVs, and 2) simulated community networks with known topologies, to show that we are able to discern between network topologies. Finally, we showcase the use of the MicNet toolbox on a large dataset from Archean Domes, consisting of more than 2,000 ASVs. Our toolbox is freely available as a GitHub repository (https://github.com/Labevo/MicNetToolbox), and it is accompanied by a web dashboard (http://micnetapplb-1212130533.us-east-1.elb.amazonaws.com) that can be used in a simple and straightforward manner with relative abundance data.
Author Summary
Microbial communities are complex systems that cannot be wholly understood when studied only through their individual components. Hence, global pattern analyses seem to be a promising complement to highly focused local approaches. Here, we introduce the MicNet toolbox, an open-source collection of several analytical methods for visualizing abundance data and creating co-occurrence networks for further analysis.
We include two modules: one for visualization and one for network analysis based on graph theory. Additionally, we introduce an enhanced version of SparCC, a method to estimate correlations for co-occurrence network construction, that is faster and supports larger datasets. We validated the methods using simulated data and a simple biological network. Our toolbox is freely available in a GitHub repository at https://github.com/Labevo/MicNetToolbox, and it is accompanied by a web dashboard that can be easily accessed and used by non-specialist users. With this implementation, we aim to provide a simple and straightforward way to explore and analyze microbial relative abundance data.
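To make the idea of structural balance in a signed co-occurrence network concrete, the following is a minimal sketch and not the MicNet implementation. It assumes the common triangle-based definition, in which a triangle is balanced when the product of its three edge signs is positive. The function name `triangle_balance` and the toy ASV labels are hypothetical and introduced only for illustration.

```python
from itertools import combinations


def triangle_balance(signed_edges):
    """Count balanced and unbalanced triangles in a signed, undirected graph.

    signed_edges maps a pair of node names to +1 (positive correlation)
    or -1 (negative correlation). Assumption: a triangle is balanced when
    the product of its three edge signs is positive.
    """
    # Store each edge under an order-independent key.
    sign = {frozenset(pair): s for pair, s in signed_edges.items()}
    nodes = sorted({n for pair in sign for n in pair})

    balanced = unbalanced = 0
    for a, b, c in combinations(nodes, 3):
        edges = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if not all(e in sign for e in edges):
            continue  # these three nodes do not form a triangle
        product = sign[edges[0]] * sign[edges[1]] * sign[edges[2]]
        if product > 0:
            balanced += 1
        else:
            unbalanced += 1
    return balanced, unbalanced


# Hypothetical toy network: four taxa with signed correlations.
toy_edges = {
    ("asv1", "asv2"): +1,
    ("asv2", "asv3"): +1,
    ("asv1", "asv3"): +1,
    ("asv1", "asv4"): -1,
    ("asv2", "asv4"): -1,
    ("asv3", "asv4"): +1,
}
balanced, unbalanced = triangle_balance(toy_edges)
print(f"balanced={balanced}, unbalanced={unbalanced}")
```

In a real analysis the edge signs would come from the correlation estimates, for example SparCC output thresholded by magnitude. Enumerating every node triple is only practical for small networks like this toy case.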
Publisher
Cold Spring Harbor Laboratory