Detecting Positive Selection in Populations Using Genetic Data

Published: 2020 Page: 87-123
ISSN: 1064-3745
Container-title: Methods in Molecular Biology

Author:

Koropoulis Angelos, Alachiotis Nikolaos, Pavlidis Pavlos

Abstract

High-throughput genomic sequencing allows to disentangle the evolutionary forces acting in populations. Among evolutionary forces, positive selection has received a lot of attention because it is related to the adaptation of populations in their environments, both biotic and abiotic. Positive selection, also known as Darwinian selection, occurs when an allele is favored by natural selection. The frequency of the favored allele increases in the population and, due to genetic hitchhiking, neighboring linked variation diminishes, creating so-called selective sweeps. Such a process leaves traces in genomes that can be detected in a future time point. Detecting traces of positive selection in genomes is achieved by searching for signatures introduced by selective sweeps, such as regions of reduced variation, a specific shift of the site frequency spectrum, and particular linkage disequilibrium (LD) patterns in the region. A variety of approaches can be used for detecting selective sweeps, ranging from simple implementations that compute summary statistics to more advanced statistical approaches, e.g., Bayesian approaches, maximum-likelihood-based methods, and machine learning methods. In this chapter, we discuss selective sweep detection methodologies on the basis of their capacity to analyze whole genomes or just subgenomic regions, and on the specific polymorphism patterns they exploit as selective sweep signatures. We also summarize the results of comparisons among five open-source software releases (SweeD, SweepFinder, SweepFinder2, OmegaPlus, and RAiSD) regarding sensitivity, specificity, and execution times. Furthermore, we test and discuss machine learning methods and present a thorough performance analysis. In equilibrium neutral models or mild bottlenecks, most methods are able to detect selective sweeps accurately.
Methods and tools that rely on linkage disequilibrium (LD) rather than single SNPs exhibit higher true positive rates than the site frequency spectrum (SFS)-based methods under the model of a single sweep or recurrent hitchhiking. However, their false positive rate is elevated when a misspecified demographic model is used to build the distribution of the statistic under the null hypothesis. Both LD and SFS-based approaches suffer from decreased accuracy on localizing the true target of selection in bottleneck scenarios. Furthermore, we present an extensive analysis of the effects of gene flow on selective sweep detection, a problem that has been understudied in selective sweep literature.

Publisher

Springer US

Link

https://link.springer.com/content/pdf/10.1007/978-1-0716-0199-0_5

References: 104 articles.

1. Aguadé M, Langley CH (1994) Polymorphism and divergence in regions of low recombination in Drosophila. In: Non-neutral evolution. Springer, Boston, pp 67–76

2. Aguadé M, Miyashita N, Langley CH (1989) Reduced variation in the yellow-achaete-scute region in natural populations of Drosophila melanogaster. Genetics 122(3):607–615

3. Akey JM, Eberle MA, Rieder MJ, Carlson CS, Shriver MD, Nickerson DA, Kruglyak L (2004) Population history and natural selection shape patterns of genetic variation in 132 genes. PLoS Biol 2(10):e286

4. Alachiotis N, Pavlidis P (2016) Scalable linkage-disequilibrium-based selective sweep detection: a performance guide. GigaScience 5(1):7. https://doi.org/10.1186/s13742-016-0114-9

5. Alachiotis N, Pavlidis P (2018) RAiSD detects positive selection based on multiple signatures of a selective sweep and SNP vectors. Commun Biol 1(1):79

Cited by 23 articles.

1. Repeated global adaptation across plant species;2024-04-03

2. Mitochondrial DNA D-Loop Polymorphisms among the Galla Goats Reveals Multiple Maternal Origins with Implication on the Functional Diversity of the HSP70 Gene;Genetics Research;2024-02-05

3. Demography as a confounding factor to explain highly diverged loci between cultivated and wild rice;Plant Genetic Resources: Characterization and Utilization;2024-02

4. Pervasive selective sweeps across human gut microbiomes;2023-12-23

5. Deep convolutional and conditional neural networks for large-scale genomic data generation;PLOS Computational Biology;2023-10-30