Abstract
CRISPR-based genome editing relies on guide RNA sequences to target specific regions of interest. A large number of methods have been developed to predict how efficient different guides are at inducing indels. As more experimental data becomes available, methods based on machine learning have become more prominent. Here, we explore whether quantifying the uncertainty around these predictions can be used to design better guide selection strategies. We demonstrate that a deep ensemble approach achieves better performance than a single model, while also providing uncertainty quantification. This allows us to design, for the first time, strategies that consider uncertainty in guide RNA selection. These strategies achieve precision over 91% and can identify suitable guides for more than 93% of genes in the mouse genome. Our deep ensemble model is available at https://github.com/bmdslab/CRISPR_DeepEnsemble.
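To illustrate the general idea of uncertainty-aware selection with a deep ensemble, the following is a minimal sketch and not the implementation in the repository above. It assumes each ensemble member outputs an efficiency score in [0, 1] for every guide, uses the spread across members as the uncertainty estimate, and accepts a guide only if a lower bound on its score clears a cutoff. The function names, the `threshold` and `k` parameters, and the toy scores are all hypothetical.

```python
import numpy as np


def ensemble_summary(member_predictions):
    """Return per-guide mean and sample standard deviation across ensemble members.

    member_predictions: array-like of shape (n_members, n_guides).
    """
    preds = np.asarray(member_predictions, dtype=float)
    mean = preds.mean(axis=0)
    std = preds.std(axis=0, ddof=1)  # disagreement between members
    return mean, std


def select_guides(mean, std, threshold=0.5, k=1.0):
    """Return indices of guides whose lower bound (mean - k * std) meets the threshold.

    Both threshold and k are illustrative values, not taken from the paper.
    """
    lower_bound = mean - k * std
    return np.flatnonzero(lower_bound >= threshold)


# Toy scores from three hypothetical ensemble members for three guides.
member_predictions = [
    [0.90, 0.60, 0.30],
    [0.88, 0.40, 0.28],
    [0.92, 0.80, 0.32],
]

mean, std = ensemble_summary(member_predictions)
selected = select_guides(mean, std, threshold=0.5, k=1.0)
print("mean:", mean)
print("std:", std)
print("selected guide indices:", selected.tolist())
```

In this toy example the second guide has a mean score above the cutoff but is rejected because the members disagree strongly about it, which is the kind of case a selection rule based on a single model's point prediction cannot distinguish.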
Publisher
Cold Spring Harbor Laboratory