Abstract
Microsatellites (MS) are tandem repeats of short units and have been used in population genetics, individual identification, and medical genetics. However, genome-wide studies of MS are limited, and genotyping methods for MS have yet to be established. Here, we analyzed approximately 8.5 million MS regions using a previously developed MS caller (the MIVcall method) in three large publicly available human genome sequencing data sets: the Korean Personal Genome Project (KPGP), the Simons Genome Diversity Project (SGDP), and the Human Genome Diversity Project (HGDP). Our analysis identified 253,114 polymorphic MS. A comparison among populations suggests that MS in coding regions evolved by random genetic drift and natural selection. In an analysis of genetic structure, MS clearly revealed population structures, as SNPs do, and detected clusters in African and Oceanian populations that were not found by SNPs. Based on the MS polymorphisms, we selected an effective MS set for individual identification. We also showed that our MS analysis method can be applied to ancient DNA samples. This study provides a comprehensive picture of MS polymorphisms and their application to human population studies.
Publisher
Cold Spring Harbor Laboratory