Efficient ancestry and mutation simulation with msprime 1.0

Authors:

Baumdicker Franz (1), Bisschop Gertjan (2), Goldstein Daniel (3, 4), Gower Graham (5), Ragsdale Aaron P (6), Tsambos Georgia (7), Zhu Sha (8), Eldon Bjarki (9), Ellerman E Castedo (10), Galloway Jared G (11, 12), Gladstein Ariella L (13, 14), Gorjanc Gregor (15), Guo Bing (16), Jeffery Ben (8), Kretzschmar Warren W (17), Lohse Konrad (2), Matschiner Michael (18), Nelson Dominic (19), Pope Nathaniel S (20), Quinto-Cortés Consuelo D (21), Rodrigues Murillo F (11), Saunack Kumar (22), Sellinger Thibaut (23), Thornton Kevin (24), van Kemenade Hugo, Wohns Anthony W (8, 4), Wong Yan (8), Gravel Simon (19), Kern Andrew D (11), Koskela Jere (25), Ralph Peter L (11, 26), Kelleher Jerome (8)

Affiliations:

1. Cluster of Excellence “Controlling Microbes to Fight Infections”, Mathematical and Computational Population Genetics, University of Tübingen, 72076 Tübingen, Germany

2. Institute of Evolutionary Biology, The University of Edinburgh, Edinburgh EH9 3FL, UK

3. Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA

4. Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA

5. Lundbeck GeoGenetics Centre, Globe Institute, University of Copenhagen, 1350 Copenhagen K, Denmark

6. Department of Integrative Biology, University of Wisconsin–Madison, Madison, WI 53706, USA

7. Melbourne Integrative Genomics, School of Mathematics and Statistics, University of Melbourne, Parkville, VIC 3010, Australia

8. Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, UK

9. Leibniz Institute for Evolution and Biodiversity Science, Museum für Naturkunde, Berlin 10115, Germany

10. Fresh Pond Research Institute, Cambridge, MA 02140, USA

11. Department of Biology, Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403-5289, USA

12. Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98102, USA

13. Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7264, USA

14. Embark Veterinary, Inc., Boston, MA 02111, USA

15. The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh EH25 9RG, UK

16. Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA

17. Center for Hematology and Regenerative Medicine, Karolinska Institute, 141 83 Huddinge, Sweden

18. Natural History Museum, University of Oslo, 0318 Oslo, Norway

19. Department of Human Genetics, McGill University, Montréal, QC H3A 0C7, Canada

20. Department of Entomology, Pennsylvania State University, State College, PA 16802, USA

21. National Laboratory of Genomics for Biodiversity (LANGEBIO), Unit of Advanced Genomics, CINVESTAV, Irapuato, Mexico

22. IIT Bombay, Powai, Mumbai 400 076, India

23. Professorship for Population Genetics, Department of Life Science Systems, Technical University of Munich, 85354 Freising, Germany

24. Department of Ecology and Evolutionary Biology, University of California, Irvine, CA 92697, USA

25. Department of Statistics, University of Warwick, Coventry CV4 7AL, UK

26. Department of Mathematics, University of Oregon, Eugene, OR 97403-5289, USA

Abstract

Stochastic simulation is a key tool in population genetics, since the models involved are often analytically intractable and simulation is usually the only way of obtaining ground-truth data to evaluate inferences. Because of this, a large number of specialized simulation programs have been developed, each filling a particular niche, but with largely overlapping functionality and a substantial duplication of effort. Here, we introduce msprime version 1.0, which efficiently implements ancestry and mutation simulations based on the succinct tree sequence data structure and the tskit library. We summarize msprime’s many features, and show that its performance is excellent, often many times faster and more memory efficient than specialized alternatives. These high-performance features have been thoroughly tested and validated, and built using a collaborative, open source development model, which reduces duplication of effort and promotes software quality via community engagement.
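To make the phrase "ancestry and mutation simulation" concrete, the following is a minimal standard-library sketch of the two-stage idea: first simulate the ancestry of a sample, then place mutations on it. It is an illustration only and is not msprime's algorithm or API. It assumes a single non-recombining locus, the standard (Kingman) coalescent with time in coalescent units, and infinite-sites mutation with scaled rate `theta`. All function names here are hypothetical.

```python
import random


def simulate_coalescent(n, rng):
    """Ancestry stage: return (tmrca, total_branch_length) for n lineages.

    Assumes the standard coalescent: while k lineages remain, the waiting
    time to the next merger is exponential with rate k * (k - 1) / 2.
    """
    tmrca = 0.0
    total_length = 0.0
    k = n
    while k > 1:
        rate = k * (k - 1) / 2
        wait = rng.expovariate(rate)
        tmrca += wait
        total_length += k * wait  # k branches each grow by `wait`
        k -= 1
    return tmrca, total_length


def poisson(mean, rng):
    """Draw a Poisson variate by counting unit-rate arrivals before `mean`."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_segregating_sites(n, theta, rng):
    """Mutation stage: Poisson number of mutations on the simulated tree.

    Under the infinite-sites assumption, the count is Poisson with mean
    theta / 2 times the total branch length.
    """
    _, total_length = simulate_coalescent(n, rng)
    return poisson(theta / 2 * total_length, rng)


if __name__ == "__main__":
    rng = random.Random(42)
    counts = [simulate_segregating_sites(10, 5.0, rng) for _ in range(1000)]
    print("mean segregating sites:", sum(counts) / len(counts))
```

Under these assumptions the mean number of segregating sites should approach `theta` times the harmonic number H(n-1), which gives a simple sanity check on a simulator of this kind. A production tool such as msprime additionally handles recombination, demography, and explicit tree sequence output, none of which this sketch attempts.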

Funder

US National Institutes of Health

Deutsche Forschungsgemeinschaft

Priority Programme SPP 1819: Rapid Evolutionary Adaptation

The Icelandic Research Centre (Rannís) through an Icelandic Research Fund Grant of Excellence

Deutsche Forschungsgemeinschaft EXC 2064/1: Project

EXC 2124: Project

Villum Fonden Young Investigator award to Fernando Racimo

Chancellor’s Fellowship of the University of Edinburgh and the UK Biotechnology and Biological Sciences Research Council grant to the Roslin Institute

UK Engineering and Physical Sciences Research Council

Robertson Foundation

Canada Research Chairs Program, from the Canadian Institutes of Health Research

Canadian Foundation for Innovation

Publisher

Oxford University Press (OUP)

Subject

Genetics

Cited by 194 articles.