Interpretable Prediction of mRNA Abundance from Promoter Sequence using Contextual Regression Models

Published: 2022-08-29

Author:

Wang Song, Wang Wei

Abstract

While machine learning models have been successfully applied to predicting gene expression from promoter sequences, it remains a great challenge to derive an intuitive interpretation of the model and to reveal DNA motif grammar, such as motif cooperation and distance constraints between motif sites. Previous interpretation approaches are often time-consuming or struggle to learn the combinatorial rules. In this work, we designed interpretable neural network models to predict mRNA expression levels from DNA sequences. By applying the Contextual Regression framework we developed, we extracted weighted features to cluster samples into groups with different gene expression levels. We performed motif analysis in each cluster and found motifs that activate or repress gene expression, as well as motif combination grammars, including several motif communities and distance constraints between cooperative motifs.

Publisher

Cold Spring Harbor Laboratory
