Non-B DNA: a major contributor to small- and large-scale variation in nucleotide substitution frequencies across the genome

Author:

Guiblet Wilfried M (1), Cremona Marzia A (2, 3, 4), Harris Robert S (5), Chen Di (6), Eckert Kristin A (7, 8), Chiaromonte Francesca (2, 8, 9), Huang Yi-Fei (5, 8), Makova Kateryna D (5, 8)

Affiliation:

1. Bioinformatics and Genomics Graduate Program, Penn State University, University Park, PA 16802, USA

2. Department of Statistics, The Pennsylvania State University, University Park, PA 16802, USA

3. Department of Operations and Decision Systems, Université Laval, Canada

4. CHU de Québec – Université Laval Research Center, Canada

5. Department of Biology, Penn State University, University Park, PA 16802, USA

6. Intercollege Graduate Degree Program in Genetics, Huck Institutes of the Life Sciences, Penn State University, University Park, PA 16802, USA

7. Department of Pathology, Penn State University, College of Medicine, Hershey, PA 17033, USA

8. Center for Medical Genomics, Penn State University, University Park and Hershey, PA, USA

9. EMbeDS, Sant’Anna School of Advanced Studies, 56127 Pisa, Italy

Abstract

Approximately 13% of the human genome can fold into non-canonical (non-B) DNA structures (e.g. G-quadruplexes, Z-DNA, etc.), which have been implicated in vital cellular processes. Non-B DNA also hinders replication, increasing errors and facilitating mutagenesis, yet its contribution to genome-wide variation in mutation rates remains unexplored. Here, we conducted a comprehensive analysis of nucleotide substitution frequencies at non-B DNA loci within noncoding, non-repetitive genome regions, their ±2 kb flanking regions, and 1-Megabase windows, using human-orangutan divergence and human single-nucleotide polymorphisms. Functional data analysis at single-base resolution demonstrated that substitution frequencies are usually elevated at non-B DNA, with patterns specific to each non-B DNA type. Mirror, direct and inverted repeats have higher substitution frequencies in spacers than in repeat arms, whereas G-quadruplexes, particularly stable ones, have higher substitution frequencies in loops than in stems. Several non-B DNA types also affect substitution frequencies in their flanking regions. Finally, non-B DNA explains more variation than any other predictor in multiple regression models for diversity or divergence at 1-Megabase scale. Thus, non-B DNA substantially contributes to variation in substitution frequencies at small and large scales. Our results highlight the role of non-B DNA in germline mutagenesis with implications for evolution and genetic diseases.

Funder

National Institutes of Health

Clinical and Translational Sciences Institute

Institute of Computational and Data Sciences

Huck Institutes of the Life Sciences

Eberly College of Science of the Pennsylvania State University

Pennsylvania Department of Health

Publisher

Oxford University Press (OUP)

Subject

Genetics

