Hold out the genome: A roadmap to solving the cis-regulatory code

Published: 2023-04-20

Author:

de Boer Carl G., Taipale Jussi

Abstract

Gene expression is regulated by transcription factors that work together to read cis-regulatory DNA sequences. The “cis-regulatory code” (the rules that cells use to determine when, where, and how much genes should be expressed) has proven to be exceedingly complex, but recent advances in the scale and resolution of functional genomics assays and machine learning have enabled significant progress towards deciphering this code. However, we will likely never solve the cis-regulatory code if we restrict ourselves to models trained only on genomic sequences; regions of homology can easily lead to overestimation of predictive performance, and there is insufficient sequence diversity in our genomes to learn all relevant parameters. Fortunately, randomly synthesized DNA sequences enable us to test a far larger sequence space than exists in our genomes in each experiment, and designed DNA sequences enable a targeted query of the sequence space to maximally improve the models. Since cells use the same biochemical principles to interpret DNA regardless of its source, models that are trained on these synthetic data can predict genomic activity, often better than genome-trained models. Here, we provide an outlook on the field, and propose a roadmap towards solving the cis-regulatory code by training models exclusively on non-genomic DNA sequences, and using genomic sequences solely for evaluating the resulting models.

Publisher

Cold Spring Harbor Laboratory

References: 156 articles.

1. Deciphering the multi-scale, quantitative cis-regulatory code

2. The Human Transcription Factors

3. Seven myths of how transcription factors read the cis-regulatory code;Curr. Opin. Syst. Biol,2020

4. The splicing code

5. Ribosome dynamics and mRNA turnover, a complex relationship under constant cellular scrutiny;Wiley Interdiscip. Rev. RNA,2021

Cited by 5 articles.

1. Characterizing uncertainty in predictions of genomic sequence-to-activity models;2023-12-23

2. High-throughput data and modeling reveal insights into the mechanisms of cooperative DNA-binding by transcription factor proteins;Nucleic Acids Research;2023-10-27

3. GUANinE v1.0: Benchmark Datasets for Genomic AI Sequence-to-Function Models;2023-10-17

4. Active learning of enhancer and silencer regulatory grammar in photoreceptors;2023-08-22

5. Machine-guided design of synthetic cell type-specific cis-regulatory elements;2023-08-09