Comparison of different assembly and annotation tools on analysis of simulated viral metagenomic communities in the gut

Author:

Vázquez-Castellanos Jorge F, García-López Rodrigo, Pérez-Brocal Vicente, Pignatelli Miguel, Moya Andrés

Abstract

Background: The main limitations in the analysis of viral metagenomes are perhaps the high genetic variability and the lack of information in extant databases. To address these issues, several bioinformatic tools have been specifically designed or adapted for metagenomics by improving read assembly and creating more sensitive methods for homology detection. This study compares the performance of different available assemblers and taxonomic annotation software using simulated viral-metagenomic data.

Results: We simulated two 454 viral metagenomes using genomes from NCBI's RefSeq database, based on the list of actual viruses found in previously published metagenomes. Three different assembly strategies, spanning six assemblers, were tested for performance: the overlap-layout-consensus algorithms Newbler, Celera and Minimo; the de Bruijn graph algorithms Velvet and MetaVelvet; and the probabilistic read model Genovo. The performance of the assemblies was measured by the length of the resulting contigs (using N50), the percentage of reads assembled and the overall accuracy when compared against the corresponding reference genomes. Additionally, the number of chimeras per contig and the lowest common ancestor were estimated in order to assess the effect of assembly on taxonomic and functional annotation. The functional classification of the reads was evaluated by counting the reads that correctly matched the functional data previously reported for the original genomes and by calculating the number of over-represented functional categories in chimeric contigs. The sensitivity and specificity of tBLASTx, PhymmBL and the k-mer frequencies were measured by the accuracy of their predictions when comparing simulated reads against the NCBI Virus genomes RefSeq database.

Conclusions: Assembly improves functional annotation by increasing accurate assignments and decreasing ambiguous hits between viruses and bacteria. However, its success is limited by the chimeric contigs that occur at all taxonomic levels. The assembler and its parameters should be selected based on the focus of each study. Minimo's non-chimeric contigs excelled in taxonomic assignment, and Genovo's long contigs excelled in functional annotation. tBLASTx stood out as the best approach to taxonomic annotation for virus identification. PhymmBL proved useful in datasets in which no related sequences are present, as it uses genomic features that may help identify distant taxa. The k-mer frequencies underperformed in all viral datasets.
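The abstract uses N50 to summarize contig length. As an illustration only, the sketch below computes N50 under its common definition: the largest length L such that contigs of length L or longer account for at least half of the total assembled bases. The function name `n50` and the example lengths are hypothetical and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Assumes the common definition: sort contigs from longest to
    shortest and return the length of the contig at which the running
    total first reaches half of the summed length of all contigs.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")

    half_total = sum(lengths) / 2
    running_total = 0
    for length in lengths:
        running_total += length
        if running_total >= half_total:
            return length


# Hypothetical contig lengths, in bases.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))
```

Because N50 is weighted by length, a few long contigs can dominate it, which may be why the abstract reports it alongside the percentage of reads assembled and the accuracy against reference genomes.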

Publisher

Springer Science and Business Media LLC

Subject

Genetics, Biotechnology

