Abstract
Long-range forensic familial searching is a new method in forensic genetics. In long-range search, a sample of interest is genotyped at single-nucleotide polymorphism (SNP) markers, and the genotype is compared with a large database in order to find relatives. Here, we perform some simple calculations that explore the basic phenomena that govern long-range searching. Two opposing phenomena, one genealogical and one genetic, govern the success of the search in a database of a given size. As one considers more distant genealogical relationships, any target sample is likely to have more relatives: on average, one has more second cousins than first cousins, and so on. But more distant relatives are also harder to detect genetically. Starting with third cousins, there is an appreciable chance that a given genealogical relationship will not be detectable genetically. Given the balance of these genealogical and genetic phenomena and the size of databases currently queryable by law enforcement, it is likely that most people with substantial recent ancestry in the United States are accessible via long-range search.

Note: This material was originally posted on the Coop lab site on May 7th, 2018, soon after the reporting of the arrest of Joseph DeAngelo in the Golden State Killer case, one of the first high-profile uses of long-range familial search. Subsequently, Erlich et al. (2018) published a detailed analysis in a large empirical dataset along with a theoretical analysis of a model similar to the one we use here, obtaining results broadly consistent with the ones presented here. Because Erlich and colleagues kindly cited this work when describing their model, we thought it would be appropriate to post this material in a venue where it is more easily cited.
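The trade-off described in the abstract (more relatives at greater genealogical distance, but a lower chance that each one is genetically detectable) can be illustrated with a rough sketch. This is not the paper's own calculation. It assumes an idealized genealogy in which every couple has the same number of children (the hypothetical parameter `children_per_couple`), a widely used Poisson approximation for the number of shared genomic segments between full cousins, an autosomal map length of about 33.8 Morgans across 22 chromosomes, and a database that samples a hypothetical fraction `database_fraction` of the population at random. It also treats any shared segment as detectable, ignoring minimum segment-length thresholds.

```python
import math


def expected_cousins(p, children_per_couple=2.5):
    """Expected number of p-th cousins under a fixed-family-size assumption."""
    # A person has 2**p ancestral couples p+1 generations back.
    ancestral_couples = 2 ** p
    # Each couple has (c - 1) children who are not the person's own ancestor,
    # and each of those has c**p descendants in the person's generation.
    c = children_per_couple
    return ancestral_couples * (c - 1) * c ** p


def prob_detectable(p, map_length_morgans=33.8, n_chromosomes=22):
    """Approximate chance that full p-th cousins share at least one segment."""
    # p-th cousins are separated by 2(p + 1) meioses.
    meioses = 2 * (p + 1)
    # Poisson approximation for the expected number of shared segments,
    # counting both members of the shared ancestral couple (assumed formula).
    expected_segments = (
        (map_length_morgans * meioses + n_chromosomes) / 2 ** (meioses - 1)
    )
    return 1 - math.exp(-expected_segments)


def prob_at_least_one_match(p, database_fraction, children_per_couple=2.5):
    """Chance that a database holds at least one detectable p-th cousin."""
    # Treat the count of detectable cousins in the database as Poisson.
    mean_hits = (
        expected_cousins(p, children_per_couple)
        * prob_detectable(p)
        * database_fraction
    )
    return 1 - math.exp(-mean_hits)


if __name__ == "__main__":
    for degree in range(1, 5):
        print(
            degree,
            expected_cousins(degree),
            round(prob_detectable(degree), 3),
            round(prob_at_least_one_match(degree, database_fraction=0.005), 3),
        )
```

Under these assumptions the expected number of cousins grows geometrically with degree while the per-pair detection probability falls, so the two effects pull in opposite directions as the abstract describes. The specific numbers depend entirely on the assumed family size, map length, and database fraction.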
Publisher
Cold Spring Harbor Laboratory
Cited by
17 articles.