Abstract
Recent advances in long-read sequencing have made it possible to address long-standing questions about the architecture and evolution of human centromeres. They have also emphasized the need for centromere annotation, that is, partitioning human centromeres into monomers and higher-order repeats (HORs). Despite a half-century-long series of semi-manual studies of centromere architecture, a rigorous centromere annotation algorithm is still lacking. Moreover, automated centromere annotation is a prerequisite for studies of genetic diseases associated with centromeres and for evolutionary studies of centromeres across multiple species. Although monomer decomposition (transforming a centromere into a monocentromere written in the monomer alphabet) and HOR decomposition (representing a monocentromere in the alphabet of HORs) are currently viewed as two separate problems, we demonstrate that they should be integrated into a single framework in which HOR inference affects monomer inference and vice versa. We therefore developed the HORmon algorithm, which integrates monomer and HOR inference and automatically generates human monomers and HORs that are largely consistent with previous semi-manual inference.
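To make the two representations concrete, the following toy sketch rewrites a monocentromere (a string over a monomer alphabet) as runs of HOR copies. It is only an illustration of the HOR-decomposition idea under strong assumptions: the HOR is given in advance, monomers are single letters, and matching is exact and greedy. The function name `hor_decompose` and the example strings are hypothetical, and this is not the HORmon algorithm.

```python
def hor_decompose(monocentromere: str, hor: str) -> list[tuple[str, int]]:
    """Greedily rewrite a monomer string as runs of (block, copy_count).

    Assumption: `hor` is a known HOR given as a string of monomer letters.
    Monomers that do not start a full HOR copy are emitted one at a time.
    """
    if not hor:
        raise ValueError("hor must contain at least one monomer")
    blocks: list[tuple[str, int]] = []
    i = 0
    while i < len(monocentromere):
        if monocentromere.startswith(hor, i):
            block = hor                      # a complete HOR copy
            i += len(hor)
        else:
            block = monocentromere[i]        # a monomer outside a full copy
            i += 1
        if blocks and blocks[-1][0] == block:
            # extend the current run of identical blocks
            blocks[-1] = (block, blocks[-1][1] + 1)
        else:
            blocks.append((block, 1))
    return blocks


# Toy monocentromere: two canonical copies of HOR "ABC", a variant
# stretch "ABD", then one more canonical copy.
print(hor_decompose("ABCABCABDABC", "ABC"))
```

In this toy setting the HOR must be known before the monomer string can be compressed, while in practice the monomers themselves are inferred from the sequence, which hints at why the abstract argues that the two inference steps should inform each other.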
Publisher
Cold Spring Harbor Laboratory
Cited by
5 articles.