Abstract
A homoplasy is a trait shared between individuals that did not arise in a common ancestor, but is instead the result of convergent evolution. Homoplasic mutations in SARS-CoV-2 are important to characterise, because the evidence that a mutation confers a fitness advantage is strengthened if the mutation has evolved independently and repeatedly in separate viral lineages. Yet detecting homoplasy is difficult, because there is insufficient variation between sequences to construct reliable phylogenetic trees. Here, we develop a method to identify homoplasies more robustly. We derive a maximum likelihood (ML) tree, in which taxa bearing seemingly recurrent mutations are dispersed across the tree. Then, for each potentially homoplasic mutation, we derive an alternative tree in which the same taxa are constrained to one clade, such that the mutation is no longer homoplasic. We then compare how well the two trees fit the sequence data. Applying this method to SARS-CoV-2 yields only a few instances where the constrained tree has significantly less statistical support than the unconstrained tree. This suggests that phylogenetics can provide only limited support for homoplasy in SARS-CoV-2, and that caution is needed when inferring convergent evolution from phylogenetic methods in the absence of evidence from other sources.
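As an illustration of the first step only, the sketch below flags a mutation as potentially homoplasic when the taxa carrying it do not form a single clade in a rooted tree. It is a toy example under stated assumptions, not the authors' implementation: the nested-tuple tree, the taxon names, and the helper functions `leaves` and `is_monophyletic` are all hypothetical, and the sketch performs no tree inference or likelihood comparison.

```python
# Toy rooted tree as nested tuples; leaves are taxon names.
# Tree shape and taxon names are hypothetical, for illustration only.

def leaves(node):
    """Return the set of leaf names found under a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def is_monophyletic(tree, taxa):
    """Return True if some clade of the rooted tree contains exactly `taxa`."""
    taxa = set(taxa)
    stack = [tree]
    while stack:
        node = stack.pop()
        if leaves(node) == taxa:
            return True
        if not isinstance(node, str):
            stack.extend(node)  # descend into child clades
    return False


toy_tree = ((("A", "B"), "C"), ("D", "E"))

# Taxa carrying a mutation; dispersed carriers suggest a candidate homoplasy.
carriers_dispersed = {"A", "D"}
carriers_clustered = {"A", "B"}

dispersed_flag = not is_monophyletic(toy_tree, carriers_dispersed)
clustered_flag = not is_monophyletic(toy_tree, carriers_clustered)

print("A,D candidate homoplasy:", dispersed_flag)
print("A,B candidate homoplasy:", clustered_flag)
```

A flagged mutation is only a candidate. In the approach described above, it would then be assessed by comparing the fit of the unconstrained tree with that of a tree in which the carriers are constrained to one clade.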
Publisher
Cold Spring Harbor Laboratory