Prediction of regulatory motifs from human ChIP-sequencing data using a deep learning framework

Author:

Yang Jinyu (1, 2), Ma Anjun (1), Hoppe Adam D (3, 4), Wang Cankun (1), Li Yang (5), Zhang Chi (6), Wang Yan (7), Liu Bingqiang (5), Ma Qin (1)

Affiliation:

1. Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA

2. Department of Computer Science and Engineering, The University of Texas at Arlington, Arlington, TX 76010, USA

3. Department of Chemistry and Biochemistry, South Dakota State University, Brookings, SD 57007, USA

4. BioSNTR, Brookings, SD 57007, USA

5. School of Mathematics, Shandong University, Jinan 250100, China

6. Department of Medical and Molecular Genetics, School of Medicine, Indiana University, Indianapolis, IN 46202, USA

7. School of Artificial Intelligence, Jilin University, Changchun 130012, China

Abstract

The identification of transcription factor binding sites and cis-regulatory motifs is a frontier at which the rules governing protein–DNA binding are being revealed. Here, we developed a new method (DEep Sequence and Shape mOtif, or DESSO) for cis-regulatory motif prediction using deep neural networks and the binomial distribution model. DESSO outperformed existing tools, including DeepBind, in predicting motifs in 690 human ENCODE ChIP-sequencing datasets. Furthermore, the deep-learning framework of DESSO expanded motif discovery beyond the state of the art by allowing the identification of known and new protein–protein–DNA tethering interactions in human transcription factors (TFs). Specifically, 61 putative tethering interactions were identified among the 100 TFs expressed in the K562 cell line. In this work, the power of DESSO was further expanded by integrating the detection of DNA shape features. We found that shape information has strong predictive power for TF–DNA binding and provides new putative shape motif information for human TFs. Thus, DESSO improves the identification and structural analysis of TF binding sites by integrating the complexities of DNA binding into a deep-learning framework.

Funder

National Science Foundation

National Institutes of Health

National Natural Science Foundation of China

Shandong University

Innovation Method Fund of China

Shanghai Municipal Science and Technology

Jilin Province

Publisher

Oxford University Press (OUP)

Subject

Genetics
