Abstract
AbstractA gene is considered essential if loss of function results in loss of viability, fitness or in disease. This concept is well established for coding genes; however, non-coding regions are thought less likely to be determinants of critical functions. Here we train a machine learning model using functional, mutational and structural features, including new genome essentiality metrics, 3D genome organization and enhancer reporter data to identify deleterious variants in non-coding regions. We assess the model for functional correlates by using data from tiling-deletion-based and CRISPR interference screens of activity of cis-regulatory elements in over 3 Mb of genome sequence. Finally, we explore two user cases that involve indels and the disruption of enhancers associated with a developmental disease. We rank variants in the non-coding genome according to their predicted deleteriousness. The model prioritizes non-coding regions associated with regulation of important genes and with cell viability, an in vitro surrogate of essentiality.
Publisher
Springer Science and Business Media LLC
Subject
General Physics and Astronomy,General Biochemistry, Genetics and Molecular Biology,General Chemistry
Cited by
73 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献