Abstract
Variation in ploidy occurs naturally in select plant and animal species. Ploidy variation can also occur spontaneously or be induced during artificial propagation of fish and shellfish. Studying species and systems that have variable ploidy requires techniques to infer the ploidy of individuals. Massively parallel sequencing of biallelic SNPs has been used to infer ploidy, but existing techniques have several drawbacks. These include being limited to comparing a fixed set of ploidies (diploidy, triploidy, and tetraploidy) and requiring that heterozygous genotypes in an individual be identified prior to ploidy inference. We describe a method of inferring ploidy from sequencing of biallelic SNPs based on beta-binomial mixture models. This method is generalized to apply to any ploidy and does not require prior identification of heterozygous genotypes. We demonstrate the efficacy of this method for comparing ancestral octoploidy, decaploidy, and dodecaploidy (tetraploidy, pentaploidy, and hexaploidy for the sequenced SNPs) in white sturgeon, and diploidy and triploidy in Chinook salmon, with amplicon sequencing (GT-seq) data. Results indicated that ploidy could be reliably estimated for individuals based on distinct distributions of log-likelihood ratios (LLRs) for known-ploidy samples of both species tested. Confidence in ploidy estimates increased with sequencing depth. We encourage users to explore the sequencing depths and LLR critical values that provide reliable estimates of ploidy for a given organism and set of SNPs. We expect that the R package provided will empower studies of genetic variation and inheritance in organisms that vary in ploidy naturally or as a result of artificial propagation practices.
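To make the idea of comparing ploidies with a beta-binomial mixture concrete, the following is a minimal Python sketch and not the implementation in the R package. It assumes that, for a candidate ploidy k, each SNP's reference-read count comes from a mixture of k + 1 beta-binomial components, one per possible reference-allele dosage j = 0..k, with expected allele fraction j/k adjusted for a sequencing error rate. The uniform mixture weights, the fixed error rate `eps`, the fixed concentration `conc`, and all function names are illustrative assumptions; a real analysis would more plausibly estimate such parameters from the data.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_logpmf(x, n, a, b):
    """Log probability of x successes in n trials under BetaBinomial(n, a, b)."""
    log_choose = (math.lgamma(n + 1) - math.lgamma(x + 1)
                  - math.lgamma(n - x + 1))
    return log_choose + log_beta(x + a, n - x + b) - log_beta(a, b)


def ploidy_loglik(counts, ploidy, eps=0.01, conc=50.0):
    """Log-likelihood of per-SNP (ref_reads, total_reads) pairs for one ploidy.

    Assumptions (hypothetical, for illustration only): uniform mixture
    weights over dosages 0..ploidy, a fixed error rate eps, and a fixed
    beta concentration conc shared by all components.
    """
    log_w = -math.log(ploidy + 1)  # uniform weight for each dosage class
    total = 0.0
    for x, n in counts:
        terms = []
        for j in range(ploidy + 1):
            frac = j / ploidy
            # Expected reference-read fraction after allowing for errors.
            p = frac * (1 - eps) + (1 - frac) * eps
            terms.append(log_w + betabinom_logpmf(x, n, p * conc,
                                                  (1 - p) * conc))
        # Log-sum-exp over mixture components for numerical stability.
        m = max(terms)
        total += m + math.log(sum(math.exp(t - m) for t in terms))
    return total


def llr(counts, ploidy_a, ploidy_b, **kwargs):
    """Log-likelihood ratio of ploidy_a versus ploidy_b (positive favors a)."""
    return (ploidy_loglik(counts, ploidy_a, **kwargs)
            - ploidy_loglik(counts, ploidy_b, **kwargs))


# Toy read counts (made up for illustration, not data from the study).
triploid_like = [(33, 100), (67, 100), (35, 100), (66, 100), (100, 100), (0, 100)]
diploid_like = [(50, 100), (48, 100), (52, 100), (100, 100), (0, 100)]

print("LLR 3n vs 2n, triploid-like reads:", llr(triploid_like, 3, 2))
print("LLR 3n vs 2n, diploid-like reads:", llr(diploid_like, 3, 2))
```

In this sketch, a positive LLR favors the first ploidy and a negative LLR favors the second. Because each SNP contributes a term to the sum, the LLR tends to move further from zero as more SNPs and deeper read counts are available, which matches the observation that confidence increases with sequencing depth. Any critical value for calling ploidy would still need to be calibrated on known-ploidy samples.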
Publisher
Cold Spring Harbor Laboratory