SnakeLines: integrated set of computational pipelines for sequencing reads

Author:

Budiš Jaroslav (1,2,3), Krampl Werner (1,3,4), Kucharík Marcel (1,3), Hekel Rastislav (1,2,4), Goga Adrián (3,5), Sitarčík Jozef (1,2,3), Lichvár Michal (1,3), Smol’ak Dávid (1,4), Böhmer Miroslav (1,3,4), Baláž Andrej (1,6), Ďuriš František (1,2), Gazdarica Juraj (1,2), Šoltys Katarína (3,4), Turňa Ján (2,3,4), Radvánszky Ján (1,3,7), Szemes Tomáš (1,3,4)

Affiliation:

1. Geneton Ltd., 841 04 Bratislava, Slovakia

2. Slovak Centre of Scientific and Technical Information, 811 04 Bratislava, Slovakia

3. Comenius University Science Park, 841 04 Bratislava, Slovakia

4. Department of Molecular Biology, Faculty of Natural Sciences, Comenius University, 841 04 Bratislava, Slovakia

5. Department of Computer Science, Faculty of Mathematics, Physics and Informatics, Comenius University, 841 04 Bratislava, Slovakia

6. Department of Applied Informatics, Faculty of Mathematics, Physics and Informatics, Comenius University, 841 04 Bratislava, Slovakia

7. Institute of Clinical and Translational Research, Biomedical Research Center, Slovak Academy of Sciences, 845 05 Bratislava, Slovakia

Abstract

With the rapid growth of massively parallel sequencing technologies, a growing number of laboratories are utilising sequenced DNA fragments for genomic analyses. Interpretation of sequencing data is, however, strongly dependent on bioinformatics processing, which is often too demanding for clinicians and researchers without a computational background. Another problem is the reproducibility of computational analyses across separate computational centres with inconsistent versions of installed libraries and bioinformatics tools. We propose an easily extensible set of computational pipelines, called SnakeLines, for processing sequencing reads, including mapping, assembly, variant calling, viral identification, transcriptomics, and metagenomics analysis. Individual steps of an analysis, along with the methods and their parameters, can be readily modified in a single configuration file. The provided pipelines are embedded in virtual environments that ensure isolation of required resources from the host operating system, rapid deployment, and reproducibility of analysis across different Unix-based platforms. SnakeLines is a powerful framework for the automation of bioinformatics analyses, with emphasis on simple set-up, modification, extensibility, and reproducibility. The framework is already routinely used in various research projects and their applications, especially in the Slovak national surveillance of SARS-CoV-2.

Funder

Operational program Integrated Infrastructure co-financed by the European Regional Development Fund

Agentúra na Podporu Výskumu a Vývoja

Publisher

Walter de Gruyter GmbH

Subject

General Medicine

Cited by 1 article.
