Position-specific evolution in transcription factor binding sites, and a fast likelihood calculation for the F81 model

Author:

Selvakumar Pavitra (1, 2), Siddharthan Rahul (1, 2)

Affiliation:

1. The Institute of Mathematical Sciences, Chennai, India

2. Homi Bhabha National Institute, Mumbai, India

Abstract

Transcription factor binding sites (TFBS), like other DNA sequences, evolve through mutation and selection related to their function. Models of nucleotide evolution describe DNA evolution via single-nucleotide mutation. A stationary vector of such a model is the long-term distribution of nucleotides, which remains unchanged under the model. Neutrally evolving sites may have uniform stationary vectors, but one expects that sites within a TFBS instead have stationary vectors that reflect the fitness of the various nucleotides at those positions. We introduce ‘position-specific stationary vectors’ (PSSVs), the collection of stationary vectors at each site in a TFBS locus, analogous to the position weight matrix (PWM) commonly used to describe TFBS. We infer PSSVs for human TFs using two evolutionary models (Felsenstein 1981, F81, and Hasegawa-Kishino-Yano 1985, HKY85). We find that PSSVs reflect the nucleotide distribution from PWMs, but with reduced specificity. We infer ancestral nucleotide distributions at individual positions and calculate ‘conditional PSSVs’, conditioned on specific choices of majority ancestral nucleotide. We find that certain ancestral nucleotides exert a strong evolutionary pressure on neighbouring sequence, while others have a negligible effect. Finally, we present a fast likelihood calculation for the F81 model on moderate-sized trees, which makes this approach feasible for large-scale studies along these lines.
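As an illustration of what 'stationary vector' means for the F81 model, the following minimal Python sketch builds the standard F81 transition matrix, P(t) = exp(-mu t) I + (1 - exp(-mu t)) 1 pi^T, and checks that the vector pi is unchanged by it. The function names, the rate parameter `mu`, the example vector and the star-shaped tree are assumptions made here for illustration only. The likelihood below is a plain sum over the root nucleotide and is not the fast likelihood calculation presented in the paper.

```python
import math

NUCS = "ACGT"

def f81_transition(pi, t, mu=1.0):
    """F81 transition matrix: P[i][j] = Pr(nucleotide j after time t | nucleotide i).

    Under F81, a site either stays as it is (probability exp(-mu*t)) or is
    redrawn from the stationary vector pi (probability 1 - exp(-mu*t)).
    """
    decay = math.exp(-mu * t)
    return [[decay * (1.0 if i == j else 0.0) + (1.0 - decay) * pi[j]
             for j in range(4)] for i in range(4)]

def evolve(dist, P):
    """Propagate a nucleotide distribution (row vector) through P."""
    return [sum(dist[i] * P[i][j] for i in range(4)) for j in range(4)]

def star_tree_likelihood(pi, leaves, t, mu=1.0):
    """Likelihood of the observed leaf nucleotides at one site.

    Assumes a star tree (hypothetical, for illustration): every leaf descends
    directly from the root along a branch of length t, and the root
    nucleotide is drawn from pi.
    """
    P = f81_transition(pi, t, mu)
    total = 0.0
    for root in range(4):
        term = pi[root]
        for leaf in leaves:
            term *= P[root][NUCS.index(leaf)]
        total += term
    return total

# Hypothetical stationary vector for a site that favours A.
pi = [0.7, 0.1, 0.1, 0.1]
P = f81_transition(pi, t=0.3)
evolved = evolve(pi, P)                      # equals pi up to rounding
site_lik = star_tree_likelihood(pi, "AAAG", t=0.3)
```

A position-specific stationary vector in the sense of the abstract would assign one such `pi` to each position of the binding site. Inferring it would mean choosing the `pi` that maximizes this kind of likelihood over the aligned sequences at that position.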

Funder

Department of Atomic Energy, Government of India

Publisher

The Royal Society

