The Effect of Sample Bias and Experimental Artefacts on the Statistical Phylogenetic Analysis of Picornaviruses

Published: 2019-11-06 Volume: 11 Issue: 11 Page: 1032
ISSN: 1999-4915
Container-title: Viruses
Language: en

Author:

Vakulenko Yulia, Deviatkin Andrei, Lukashev Alexander

Abstract

Statistical phylogenetic methods are a powerful tool for inferring the evolutionary history of viruses through time and space. The selection of mathematical models and analysis parameters has a major impact on the outcome, and has been relatively well-described in the literature. The preparation of a sequence dataset is less formalized, but its impact can be even more profound. This article used simulated datasets of enterovirus sequences to evaluate the effect of sample bias on picornavirus phylogenetic studies. Possible approaches to the reduction of large datasets and their potential for introducing additional artefacts were demonstrated. The most consistent results were obtained using “smart sampling”, which reduced sequence subsets from large studies more than those from smaller ones in order to preserve the rare sequences in a dataset. The effect of sequences with technical or annotation errors in the Bayesian framework was also analyzed. Sequences with about 0.5% sequencing errors or incorrect isolation dates altered by just 5 years could be detected by various approaches, but the efficiency of identification depended upon sequence position in a phylogenetic tree. Even a single erroneous sequence could profoundly destabilize the whole analysis by increasing the variance of the inferred evolutionary parameters.
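
The "smart sampling" idea described in the abstract (reducing subsets from large studies more than those from smaller ones, so that rare sequences survive) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the per-study cap, the function name `smart_sample`, the `(sequence_id, study_id)` record layout, and the toy data are all assumptions made for illustration only.

```python
import random
from collections import defaultdict


def smart_sample(records, cap=5, seed=0):
    """Down-sample (sequence_id, study_id) records with a per-study cap.

    Hypothetical rule (an assumption, not the paper's procedure):
    studies with at most `cap` sequences are kept whole, while larger
    studies are randomly reduced to `cap` sequences.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible

    # Group sequence identifiers by the study they came from.
    by_study = defaultdict(list)
    for seq_id, study_id in records:
        by_study[study_id].append(seq_id)

    kept = []
    for study_id in sorted(by_study):
        ids = by_study[study_id]
        if len(ids) <= cap:
            kept.extend(ids)  # small study: preserve every (rare) sequence
        else:
            kept.extend(sorted(rng.sample(ids, cap)))  # large study: thin out
    return kept


# Toy data: one large outbreak study and two single-sequence reports.
records = [(f"big_{i}", "study_A") for i in range(50)]
records += [("rare_1", "study_B"), ("rare_2", "study_C")]

subset = smart_sample(records, cap=5)
print(len(subset), "sequences kept out of", len(records))
```

A hard cap is only one way to reduce large subsets more strongly than small ones; a sublinear rule such as keeping roughly the square root of each study's size would follow the same principle.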

Funder

Russian Science Foundation

Publisher

MDPI AG

Subject

Virology, Infectious Diseases

Link

https://www.mdpi.com/1999-4915/11/11/1032/pdf