Metaphor—A workflow for streamlined assembly and binning of metagenomes

Author:

Salazar Vinícius W (1), Shaban Babak (2), Quiroga Maria del Mar (2), Turnbull Robert (2), Tescari Edoardo (2), Rossetto Marcelino Vanessa (3, 4, 5, 6), Verbruggen Heroen (5), Lê Cao Kim-Anh (1)

Affiliation:

1. Melbourne Integrative Genomics, School of Mathematics & Statistics, University of Melbourne, Parkville, VIC 3052, Victoria, Australia

2. Melbourne Data Analytics Platform (MDAP), University of Melbourne, Carlton, VIC 3053, Victoria, Australia

3. Department of Molecular and Translational Sciences, Monash University, Clayton, VIC 3168, Victoria, Australia

4. Centre for Innate Immunity and Infectious Diseases, Hudson Institute of Medical Research, Clayton, VIC 3168, Victoria, Australia

5. School of BioSciences, University of Melbourne, Parkville, VIC 3052, Victoria, Australia

6. Department of Microbiology and Immunology, The University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Parkville, VIC 3052, Victoria, Australia

Abstract

Recent advances in bioinformatics and high-throughput sequencing have enabled the large-scale recovery of genomes from metagenomes. This has the potential to bring important insights, as researchers can bypass cultivation and analyze genomes sourced directly from environmental samples. There are, however, technical challenges associated with this process, most notably the complexity of the computational workflows required to process metagenomic data, which include dozens of bioinformatics software tools, each with its own set of customizable parameters that affect the final output of the workflow. At the core of these workflows are the processes of assembly (combining the short input reads into longer, contiguous fragments, or contigs) and binning (clustering these contigs into individual genome bins). The limitations of assembly and binning algorithms also pose different challenges depending on the strategy selected to execute them: both processes can be done for each sample separately or by pooling multiple samples to leverage information from a combination of samples. Here we present Metaphor, a fully automated workflow for genome-resolved metagenomics (GRM). Metaphor differs from existing GRM workflows by offering flexible approaches for the assembly and binning of the input data and by combining multiple binning algorithms with a bin refinement step to achieve high-quality genome bins. Moreover, Metaphor generates reports to evaluate the performance of the workflow. We showcase the functionality of Metaphor on different synthetic datasets and the impact of available assembly and binning strategies on the final results.
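The choice between per-sample and pooled processing described in the abstract can be illustrated with a minimal Python sketch. This is not Metaphor's code or configuration: the function name `plan_groups`, the strategy labels `"individual"` and `"pooled"`, and the optional `group_of` mapping are all hypothetical, introduced only to show how samples might be assigned to assembly (or binning) units under each strategy.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def plan_groups(
    samples: List[str],
    strategy: str = "individual",
    group_of: Optional[Dict[str, str]] = None,
) -> Dict[str, List[str]]:
    """Map each assembly (or binning) unit to the samples whose reads feed it.

    Hypothetical helper for illustration; names and strategy labels are
    assumptions, not part of Metaphor.
    """
    if strategy == "individual":
        # Every sample is processed on its own.
        return {sample: [sample] for sample in samples}

    if strategy == "pooled":
        # Samples are pooled together; an optional mapping splits the pool
        # into named groups, otherwise all samples share one unit.
        groups: Dict[str, List[str]] = defaultdict(list)
        for sample in samples:
            label = group_of.get(sample, "pooled") if group_of else "pooled"
            groups[label].append(sample)
        return dict(groups)

    raise ValueError(f"unknown strategy: {strategy!r}")


if __name__ == "__main__":
    samples = ["S1", "S2", "S3"]
    print(plan_groups(samples, "individual"))
    print(plan_groups(samples, "pooled"))
    print(plan_groups(samples, "pooled", {"S1": "soil", "S2": "soil", "S3": "water"}))
```

In this sketch, the individual strategy yields one unit per sample, while the pooled strategy trades per-sample resolution for the extra information gained by combining reads from several samples.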

Funder

Australian Research Council

National Health and Medical Research Council

Publisher

Oxford University Press (OUP)

Subject

Computer Science Applications, Health Informatics
