Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations

Author:

Lauterbur M. Elise, Cavassim Maria Izabel A., Gladstein Ariella L., Gower Graham, Pope Nathaniel S., Tsambos Georgia, Adrion Jeff, Belsare Saurabh, Biddanda Arjun, Caudill Victoria, Cury Jean, Echevarria Ignacio, Haller Benjamin C., Hasan Ahmed R., Huang Xin, Iasi Leonardo Nicola Martin, Noskova Ekaterina, Obšteter Jana, Pavinato Vitor Antonio Corrêa, Pearson Alice, Peede David, Perez Manolo F., Rodrigues Murillo F., Smith Chris C. R., Spence Jeffrey P., Teterina Anastasia, Tittes Silas, Unneberg Per, Vazquez Juan Manuel, Waples Ryan K., Wohns Anthony Wilder, Wong Yan, Baumdicker Franz, Cartwright Reed A., Gorjanc Gregor, Gutenkunst Ryan N., Kelleher Jerome, Kern Andrew D., Ragsdale Aaron P., Ralph Peter L., Schrider Daniel R., Gronau Ilan

Abstract

Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic data sets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and to the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes for species that are not well-studied, since it is not always clear what information is required to produce simulations with a level of realism sufficient to confidently answer a given question. The community-developed framework stdpopsim seeks to lower this barrier by facilitating the simulation of complex population genetic models using up-to-date information. The initial version of stdpopsim focused on establishing this framework using six well-characterized model species (Adrion et al., 2020). Here, we report on major improvements made in the new release of stdpopsim (version 0.2), which includes a significant expansion of the species catalog and substantial additions to simulation capabilities. Features added to improve the realism of the simulated genomes include non-crossover recombination and provision of species-specific genomic annotations. Through community-driven efforts, we expanded the number of species in the catalog more than three-fold and broadened coverage across the tree of life. During the process of expanding the catalog, we have identified common sticking points and developed best practices for setting up genome-scale simulations. We describe the input data required for generating a realistic simulation, suggest good practices for obtaining the relevant information from the literature, and discuss common pitfalls and major considerations. These improvements to stdpopsim aim to further promote the use of realistic whole-genome population genetic simulations, especially in non-model organisms, making them available, transparent, and accessible to everyone.

Publisher

Cold Spring Harbor Laboratory
