Abstract
To control gene expression, regulatory DNA variants are commonly designed using random synthetic approaches based on mutagenesis and screening. This, however, limits the designed DNA to merely a part of a single regulatory region, whereas the whole gene regulatory structure, including the coding and adjacent non-coding regions, is involved in controlling gene expression. Here, we prototype a deep neural network strategy that models whole gene regulatory structures and generates de novo functional regulatory DNA with prespecified expression levels. By learning directly from natural genomic data, without the need for large synthetic DNA libraries, our ExpressionGAN can traverse the whole sequence-expression landscape to produce sequence variants with target mRNA levels as well as natural-like properties, including over 30% dissimilarity to any natural sequence. We experimentally demonstrate that this generative strategy is more efficient than a mutational one when using purely natural genomic data, as 57% of the newly generated highly expressed sequences surpass the expression levels of natural controls. We foresee this as a lucrative strategy to expand our knowledge of gene expression regulation and to increase expression control in any desired organism for synthetic biology and metabolic engineering applications.
Publisher
Cold Spring Harbor Laboratory
Cited by
4 articles.