Abstract
Summary
MetaCerberus is an exclusively HMM/HMMER-based tool that is massively parallel, uses little memory, and provides rapid, scalable annotation for functional gene inference from genomes to metacommunities. It provides robust enumeration of functional genes and pathways across many current public databases, including KEGG (KO), COGs, CAZy, FOAM, and virus-specific databases (i.e., VOGs and PHROGs). In a direct comparison, MetaCerberus was twice as fast as EggNOG-Mapper and produced better annotation of viruses, phages, and archaeal viruses than DRAM, PROKKA, or InterProScan. MetaCerberus annotates more KOs across domains than DRAM, with a 186x smaller database and a third less memory. MetaCerberus is fully integrated with differential statistical tools (i.e., DESeq2 and edgeR), pathway enrichment (GAGE R), and Pathview R for quantitative elucidation of metabolic pathways. MetaCerberus offers a key to unlocking the biosphere across the tree of life at scale.

Availability and implementation
MetaCerberus is written in Python 3 for both Linux and Mac OS X and is distributed under a BSD-3 license. The source code is freely available at https://github.com/raw-lab/metacerberus. MetaCerberus can also be easily installed using mamba:
mamba create -n metacerberus -c bioconda -c conda-forge metacerberus
Publisher
Cold Spring Harbor Laboratory