Abstract
Genomes are an integral component of the biological information about an organism and, logically, the more complete the genome, the more informative it is. Historically, bacterial and archaeal genomes were reconstructed from pure (monoclonal) cultures and the first reported sequences were manually curated to completion. However, the bottleneck imposed by the requirement for isolates precluded genomic insights for the vast majority of microbial life. Shotgun sequencing of microbial communities, referred to initially as community genomics and subsequently as genome-resolved metagenomics, can circumvent this limitation by obtaining metagenome-assembled genomes (MAGs), but gaps, local assembly errors, chimeras and contamination by fragments from other genomes limit the value of these genomes. Here, we discuss genome curation to improve and in some cases achieve complete (circularized, no gaps) MAGs (CMAGs). To date, few CMAGs have been generated, although notably some are from very complex systems such as soil and sediment. Through analysis of ~7000 published complete bacterial isolate genomes, we verify the value of cumulative GC skew in combination with other metrics to establish bacterial genome sequence accuracy. Interestingly, analysis of cumulative GC skew identified potential mis-assemblies in some reference genomes of isolated bacteria and the repeat sequences that likely gave rise to them. We discuss methods that could be implemented in bioinformatic approaches for curation to ensure that metabolic and evolutionary analyses can be based on very high-quality genomes.
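As a rough illustration of the cumulative GC skew metric mentioned above, the following minimal Python sketch computes the skew (G - C) / (G + C) in consecutive windows and accumulates it along the sequence. The function names, the non-overlapping windowing and the default window size are assumptions made for illustration; they are not taken from the paper and do not reproduce the authors' pipeline.

```python
def gc_skew_windows(seq, window=1000):
    """Return (G - C) / (G + C) for consecutive non-overlapping windows.

    Hypothetical helper: the window size and the non-overlapping layout
    are illustrative choices, not the authors' settings.
    """
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # Windows with no G or C contribute zero skew.
        skews.append((g - c) / (g + c) if (g + c) else 0.0)
    return skews


def cumulative_gc_skew(seq, window=1000):
    """Return the running sum of the windowed GC skew along the sequence."""
    total = 0.0
    cumulative = []
    for skew in gc_skew_windows(seq, window):
        total += skew
        cumulative.append(total)
    return cumulative


# Toy sequence: a G-rich half followed by a C-rich half.
toy = "G" * 20 + "C" * 20
print(cumulative_gc_skew(toy, window=10))
```

For many complete bacterial genomes the cumulative curve is expected to show a single minimum and a single maximum, commonly interpreted as the replication origin and terminus; abrupt departures from that shape are the kind of signal that could prompt a closer look for mis-assembly.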
Publisher
Cold Spring Harbor Laboratory