Identification and analysis of consensus RNA motifs binding to the genome regulator CTCF

Author:

Kuang Shuzhen (1, 2), Wang Liangjiang (1)

Affiliation:

1. Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, USA

2. Department of Biological Sciences, Clemson University, Clemson, SC 29634, USA

Abstract

CCCTC-binding factor (CTCF) is a key regulator of 3D genome organization and gene expression. Recent studies suggest that RNA transcripts, mostly long non-coding RNAs (lncRNAs), can serve as locus-specific factors that bind CTCF and recruit it to chromatin. However, it remains unclear whether CTCF-binding RNA sites share specific sequence patterns, and no RNA motif for CTCF binding has been reported so far. In this study, we developed DeepLncCTCF, a new deep learning model based on a convolutional neural network and a bidirectional long short-term memory network, to discover the RNA recognition patterns of CTCF and to identify candidate lncRNAs that bind to CTCF. When evaluated on two different datasets, a human U2OS dataset and a mouse ESC dataset, DeepLncCTCF accurately predicted CTCF-binding RNA sites from nucleotide sequence. By examining the sequence features learned by DeepLncCTCF, we discovered a novel RNA motif with the consensus sequence AGAUNGGA for potential CTCF binding in humans. Furthermore, the applicability of DeepLncCTCF was demonstrated by identifying nearly 5000 candidate lncRNAs that might bind to CTCF in the nucleus. Our results provide useful information for understanding the molecular mechanisms of CTCF function in 3D genome organization.
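
To make the reported consensus concrete, the sketch below scans an RNA sequence for exact matches to AGAUNGGA, reading N as any nucleotide (the IUPAC convention). This is not DeepLncCTCF and not the authors' method: the model learns sequence features, whereas this is plain pattern matching. The function names and the toy sequence are hypothetical and chosen only for illustration.

```python
import re

# Consensus reported in the abstract; N is read as "any nucleotide" (IUPAC).
CONSENSUS = "AGAUNGGA"


def consensus_to_regex(consensus):
    """Turn a consensus string with N wildcards into a compiled regex."""
    pattern = "".join("[ACGU]" if base == "N" else base for base in consensus)
    # The lookahead wrapper lets overlapping occurrences be reported too.
    return re.compile("(?=(" + pattern + "))")


def find_motif_sites(sequence, consensus=CONSENSUS):
    """Return (0-based start, matched k-mer) pairs for each occurrence."""
    # Accept DNA-style input by converting T to U (an assumption of this sketch).
    rna = sequence.upper().replace("T", "U")
    regex = consensus_to_regex(consensus)
    return [(m.start(), m.group(1)) for m in regex.finditer(rna)]


# Made-up toy sequence, not taken from any dataset in the paper.
example = "GGAGAUCGGAUUAGATAGGACC"

if __name__ == "__main__":
    for start, kmer in find_motif_sites(example):
        print(start, kmer)
```

An exact-match scan like this only flags candidate positions. It says nothing about binding strength or sequence context, which is what a trained model would be needed for.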

Publisher

Oxford University Press (OUP)

Subject

General Medicine


