Abstract
Summary
Deep metagenomic data from population studies enable genome recovery and the construction of population-specific references, revealing new species and microbial diversity that global references might miss. We constructed an Estonian population-specific reference of metagenome-assembled genomes (MAGs) from 1,878 stool samples of the EstMB-deep cohort. We assembled 84,762 MAGs representing 2,257 species, including 353 potentially novel species (15.6%). Additionally, 607 species (26.9%) were not present in the global Unified Human Gastrointestinal Genome (UHGG) reference database and may therefore be population-specific. We further demonstrated the value of de novo assembly of bacterial genomes by analysing associations with 33 prevalent diseases, detecting 44 significant associations for 15 diseases, including associations with 10 potentially new species and 5 species absent from UHGG. These associations, especially those involving new species, demonstrate that de novo bacterial genome assembly from population cohorts can provide novel insights linking the microbiome with prevalent diseases and can uncover population-specific differences.
Publisher
Cold Spring Harbor Laboratory