Affiliation:
1. Department of Biology, Pennsylvania State University, University Park, PA
2. Department of Statistics, Pennsylvania State University, University Park, PA
3. EMbeDS, Sant’Anna School of Advanced Studies, Pisa, Italy
4. Center for Medical Genomics, Penn State, University Park, PA
Abstract
Satellite repeats are a structural component of centromeres and telomeres, and in some instances, their divergence is known to drive speciation. Due to their highly repetitive nature, satellite sequences have been understudied and underrepresented in genome assemblies. To investigate their turnover in great apes, we studied satellite repeats of unit sizes up to 50 bp in human, chimpanzee, bonobo, gorilla, and Sumatran and Bornean orangutans, using unassembled short and long sequencing reads. The density of satellite repeats, as identified from accurate short reads (Illumina), varied greatly among great ape genomes. These were dominated by a handful of abundant repeated motifs, frequently shared among species, which formed two groups: 1) the (AATGG)n repeat (critical for heat shock response) and its derivatives; and 2) subtelomeric 32-mers involved in telomeric metabolism. Using the densities of abundant repeats, individuals could be classified into species. However, clustering did not reproduce the accepted species phylogeny, suggesting rapid repeat evolution. Several abundant repeats were enriched in males versus females; using Y chromosome assemblies or Fluorescent In Situ Hybridization, we validated their location on the Y. Finally, applying a novel computational tool, we identified many satellite repeats completely embedded within long Oxford Nanopore and Pacific Biosciences reads. Such repeats were up to 59 kb in length and consisted of perfect repeats interspersed with other similar sequences. Our results based on sequencing reads generated with three different technologies provide the first detailed characterization of great ape satellite repeats, and open new avenues for exploring their functions.
Funder
National Institutes of Health
Publisher
Oxford University Press (OUP)
Subject
Genetics, Molecular Biology, Ecology, Evolution, Behavior and Systematics
Cited by
38 articles.