CRMnet: A deep learning model for predicting gene expression from large regulatory sequence datasets

Published:2023-03-14 Volume:6
ISSN:2624-909X
Container-title:Frontiers in Big Data
Short-container-title:Front. Big Data

Author:

Ding Ke, Dixit Gunjan, Parker Brian J., Wen Jiayu

Abstract

Recent large datasets measuring the gene expression of millions of possible gene promoter sequences provide a resource to design and train optimized deep neural network architectures to predict expression from sequences. High predictive performance due to the modeling of dependencies within and between regulatory sequences is an enabler for biological discoveries in gene regulation through model interpretation techniques. To understand the regulatory code that delineates gene expression, we have designed a novel deep-learning model (CRMnet) to predict gene expression in Saccharomyces cerevisiae. Our model outperforms the current benchmark models and achieves a Pearson correlation coefficient of 0.971 and a mean squared error of 3.200. Interpretation of informative genomic regions determined from model saliency maps, and overlapping the saliency maps with known yeast motifs, supports that our model can successfully locate the binding sites of transcription factors that actively modulate gene expression. We compare our model's training times on a large compute cluster with GPUs and Google TPUs to indicate practical training times on similar datasets.
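The two evaluation metrics quoted in the abstract, the Pearson correlation coefficient and the mean squared error, can be illustrated with a minimal sketch. The function names and the toy expression values below are hypothetical and chosen only for illustration; they are not taken from the paper or its code, and the sketch says nothing about how the authors computed their reported figures.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation between measured and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: measured vs. predicted expression levels.
measured = [2.0, 5.0, 9.0, 12.0, 16.0]
predicted = [3.0, 4.0, 10.0, 11.0, 17.0]

print("Pearson r:", pearson_r(measured, predicted))
print("MSE:", mean_squared_error(measured, predicted))
```

Pearson correlation measures how well predictions track the ranking and linear trend of the measurements regardless of scale, while the mean squared error penalizes absolute deviations, which is why the two are typically reported together.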

Publisher

Frontiers Media SA

Subject

Artificial Intelligence, Information Systems, Computer Science (miscellaneous)

References: 28 articles.

1. Sanity checks for saliency maps;Adebayo;Advances in Neural Information Processing Systems 31,2018

2. Effective gene expression prediction from sequence by integrating long-range interactions;Avsec;Nat. Methods,2021

3. The MEME Suite;Bailey;Nucleic Acids Res.,2015

4. JASPAR 2022: the 9th release of the open-access database of transcription factor binding profiles;Castro-Mondragon;Nucleic Acids Res.,2022

5. TransUNet: transformers make strong encoders for medical image segmentation;Chen;arXiv preprint arXiv:2102.04306,2021

Cited by 2 articles.

1. Deep-Learning Uncovers certain CCM Isoforms as Transcription Factors;Frontiers in Bioscience-Landmark;2024-02-21

2. Proformer: a hybrid macaron transformer model predicts expression values from promoter sequences;BMC Bioinformatics;2024-02-20