Abstract
Counting the number of species, items, or genes that are shared between two sets is a simple calculation when sampling is complete. However, when only partial samples are available, quantifying the overlap between two sets becomes an estimation problem. Furthermore, to calculate normalized measures of β-diversity, such as the Jaccard and Sørensen-Dice indices, one must also estimate the total sizes of the sets being compared. Previous efforts to address these problems have assumed knowledge of total population sizes and then used Bayesian methods to produce unbiased estimates with quantified uncertainty. Here, we address populations of unknown size and show that doing so produces systematically better estimates, in terms of both the central estimates and the quantification of uncertainty in those estimates. We further show how to use species count data to refine estimates of population size in a Bayesian joint model of populations and overlap.
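To make the dependence on set sizes concrete, the following minimal sketch computes the two indices named above from their standard definitions, given the two set sizes and the size of the overlap. The function names `jaccard` and `sorensen_dice` are illustrative choices, not taken from the paper, and the code covers only the complete-sampling case; it does not implement the Bayesian estimation the abstract describes.

```python
def _check(size_a: int, size_b: int, overlap: int) -> None:
    """Validate that the inputs describe two sets and a feasible overlap."""
    if size_a < 0 or size_b < 0 or overlap < 0:
        raise ValueError("sizes and overlap must be non-negative")
    if overlap > min(size_a, size_b):
        raise ValueError("overlap cannot exceed the smaller set size")
    if size_a + size_b == 0:
        raise ValueError("indices are undefined when both sets are empty")


def jaccard(size_a: int, size_b: int, overlap: int) -> float:
    """Jaccard index: |A ∩ B| / |A ∪ B|, with |A ∪ B| = |A| + |B| - |A ∩ B|."""
    _check(size_a, size_b, overlap)
    return overlap / (size_a + size_b - overlap)


def sorensen_dice(size_a: int, size_b: int, overlap: int) -> float:
    """Sørensen-Dice index: 2 |A ∩ B| / (|A| + |B|)."""
    _check(size_a, size_b, overlap)
    return 2 * overlap / (size_a + size_b)


if __name__ == "__main__":
    # Hypothetical, fully sampled sets used only for illustration.
    a = {"x", "y", "z"}
    b = {"y", "z", "w"}
    shared = len(a & b)
    print(jaccard(len(a), len(b), shared))
    print(sorensen_dice(len(a), len(b), shared))
```

Both indices take the total set sizes as inputs, which is why incomplete sampling turns the calculation into an estimation problem for the sizes as well as for the overlap.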
Publisher
Cold Spring Harbor Laboratory