Abstract
Nucleic acid molecular biology and synthetic biology are advancing rapidly with the emergence of designer riboswitches that control living cells, CRISPR/Cas9-based genome editing, high-throughput RNA-based silencing, and the reengineering of mRNA translation. Many of these efforts require the design of nucleic acid interactions, which relies on accurate models of DNA and RNA energetics. Existing models use nearest-neighbor rules, which were parameterized through careful optical melting measurements. However, these relatively simple rules often fail to account quantitatively for the biophysical behavior of molecules even in vitro, let alone in vivo. This shortfall stems from the limited experimental throughput of optical melting experiments and the infinitely large space of possible motifs that can be formed. Here, we present a convolutional neural network architecture that models the energies of nucleic acid motifs and learns representations of physical interactions that generalize to arbitrary unmeasured motifs. First, we trained the model on existing parameterizations of motif energies and demonstrate that it is expressive enough to recapitulate the current model. Then, by training on optical melting datasets from the literature, we show that the model can accurately predict the thermodynamics of hairpins containing unmeasured motifs. This work demonstrates the utility of convolutional models for capturing the thermodynamic parameters that underlie nucleic acid interactions.
Publisher
Cold Spring Harbor Laboratory