Abstract
A new approach to the coalescent, the fractional coalescent (f-coalescent), is introduced. Two derivations are presented. In the first, the f-coalescent is based on an extension of the discrete-time Wright-Fisher model in which, for a population of size N, the probability that two randomly selected individuals have the same parent in the previous generation depends on the parameter α. In the second, the f-coalescent is based on an extension of the discrete-time Cannings population model in which the variance of the number of offspring is treated as a random variable that depends on α. In this second derivation, the f-coalescent also emerges as a continuous-time semi-Markov process. The additional parameter α affects the variability of the waiting times: values of α < 1 produce more short time intervals but occasionally allow very long ones. When α = 1, the f-coalescent and Kingman’s n-coalescent are equivalent. As the sample size increases, the distribution of the time to the most recent common ancestor has a lower mode under the f-coalescent than under the n-coalescent, and the mode occurs at an earlier time. This distribution also has heavier tails under the f-coalescent than under the n-coalescent. The probability that n genes descend from m ancestral genes under the f-coalescent is also derived. The f-coalescent has been implemented in the population genetic model inference software MIGRATE. Simulation studies suggest that it is possible to infer the correct α values from data that were generated with known α values. For data simulated under models with α < 1, and for three example datasets (H1N1 influenza, malaria parasites, and humpback whales), Bayes factor comparisons show an improved model fit of the f-coalescent over the n-coalescent.
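The effect of α on waiting times can be illustrated with a small simulation. The sketch below is not taken from MIGRATE and rests on assumptions the abstract does not state: waiting times between coalescent events are drawn from a Mittag-Leffler distribution (a common choice for fractional semi-Markov processes), the rate with k lineages is k(k-1)/2 in coalescent time units, and the function names `mittag_leffler_wait` and `simulate_tmrca` are hypothetical. Under these assumptions, α = 1 reduces to exponential waiting times, as in Kingman’s n-coalescent.

```python
import math
import random


def mittag_leffler_wait(alpha, rate, rng):
    """Draw one waiting time with survival function E_alpha(-rate * t**alpha).

    ASSUMPTION: Mittag-Leffler waiting times are used here for illustration;
    the abstract does not name the waiting-time distribution.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    u = 1.0 - rng.random()      # uniform on (0, 1], keeps log(u) finite
    v = rng.random()
    while v == 0.0:             # v must lie strictly inside (0, 1)
        v = rng.random()
    if alpha == 1.0:
        w = 1.0                 # exponential special case
    else:
        # Same value as sin(a*pi)/tan(a*pi*v) - cos(a*pi), always positive.
        w = math.sin(alpha * math.pi * (1.0 - v)) / math.sin(alpha * math.pi * v)
    scale = rate ** (-1.0 / alpha)
    return -scale * math.log(u) * w ** (1.0 / alpha)


def simulate_tmrca(n, alpha, rng):
    """Sum waiting times while the number of lineages drops from n to 1.

    ASSUMPTION: with k lineages the rate is k*(k-1)/2 (coalescent units).
    """
    total = 0.0
    for k in range(n, 1, -1):
        total += mittag_leffler_wait(alpha, k * (k - 1) / 2.0, rng)
    return total


if __name__ == "__main__":
    rng = random.Random(42)
    for alpha in (1.0, 0.7):
        times = sorted(simulate_tmrca(10, alpha, rng) for _ in range(5000))
        median = times[len(times) // 2]
        print(f"alpha={alpha}: median={median:.3f}, max={times[-1]:.3f}")
```

Running the script prints the median and maximum simulated time to the most recent common ancestor for α = 1 and α = 0.7, which gives a rough view of how the tails of the two settings compare in this toy model.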
Publisher
Cold Spring Harbor Laboratory