A broad survey of DNA sequence data simulation tools

Author:

Alosaimi Shatha (1), Bandiang Armand (1), van Biljon Noelle (2), Awany Denis (1), Thami Prisca K (1, 3), Tchamga Milaine S S (1), Kiran Anmol (4, 5), Messaoud Olfa (6), Hassan Radia Ismaeel Mohammed (1), Mugo Jacquiline (2), Ahmed Azza (7), Bope Christian D (2), Allali Imane (2, 8), Mazandu Gaston K (1, 2, 9), Mulder Nicola J (2), Chimusa Emile R (1)

Affiliation:

1. Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa

2. Computational Biology Division, Department of Integrative Biomedical Sciences, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa

3. Botswana Harvard AIDS Institute Partnership, Gaborone, Botswana

4. Malawi-Liverpool-Wellcome Trust Clinical Research Programme, Blantyre, Malawi

5. Edinburgh University, Edinburgh, UK

6. Université de Tunis El Manar, Institut Pasteur de Tunis, LR16IPT05 Génomique Biomédicale et Oncogénétique, Tunis, 1002, Tunisia

7. Centre for Bioinformatics and Systems Biology, Faculty of Science, University of Khartoum, Sudan

8. Laboratory of Human Pathologies Biology, Department of Biology, Faculty of Sciences, and Genomic Center of Human Pathologies, Faculty of Medicine and Pharmacy, Mohammed V University in Rabat, Morocco

9. African Institute for Mathematical Sciences (AIMS), Cape Town, South Africa

Abstract

In silico DNA sequence generation is a powerful technology for evaluating and validating bioinformatics tools, and accordingly more than 35 DNA sequence simulation tools have been developed. With such a diverse array of tools to choose from, an important question is: which tool should be used for a desired outcome? This question is largely unanswered because documentation for many of these tools is sparse. To address this, we reviewed the DNA sequence simulation tools developed to date and evaluated 20 state-of-the-art tools on their ability to produce accurate reads based on their implemented sequence error models. We provide a succinct description of each tool and suggest which tool is most appropriate for different scenarios. Given the multitude of similar yet non-identical tools, researchers can use this review as a guide to inform their choice of DNA sequence simulation tool. This paves the way towards assessing existing tools in a unified framework, as well as enabling analysis of different simulation scenarios within the same framework.

Funder

DAAD

German Academic Exchange Programme

National Institutes of Health

National Research Foundation

Sub-Saharan African Network

DELTAS Africa Initiative

African Academy of Sciences

Accelerating Excellence in Science

New Partnership for Africa’s Development Planning and Coordinating Agency

Wellcome Trust

Publisher

Oxford University Press (OUP)

