Abstract
Protein abundance is defined by transcriptional, post-transcriptional and post-translational regulatory mechanisms. Understanding the code for gene expression could inform novel therapies. Here, we develop a machine learning pipeline, termed SONAR, to decipher the endogenous sequence code that determines mRNA and protein abundance in human cells. SONAR predicts up to 63% of protein abundance independent of promoter or enhancer information, and reveals a strong, yet dynamic, cell type-specific sequence code. The deep knowledge of SONAR provides a map of biologically active sequence features (SFs), which we leveraged to manipulate protein expression and tailor it to a specific cell type. Beyond its fundamental findings, our work provides novel means to improve immunotherapies and biotechnology applications.
One Sentence Summary
SONAR reveals the cell type-specific sequence code for mRNA and protein expression in human immune cells.
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.