Abstract
DNA, with its high storage density and long-term stability, is a potential candidate for next-generation data storage. The DNA data storage channel, comprising synthesis, amplification, storage, and sequencing, exhibits error probabilities and error profiles specific to each of its components. Here, we present Autoturbo-DNA, a PyTorch framework for training error-correcting, overcomplete autoencoders specifically tailored to the DNA data storage channel. It allows different architecture combinations to be trained and supports a wide variety of channel component models for noise generation during training. It further supports training the encoder to generate DNA sequences that adhere to user-defined constraints.
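To make the two ideas in the abstract concrete (noise injected by a channel model, and constraints on the encoded sequences), here is a minimal, standard-library-only sketch. It is not Autoturbo-DNA's API: the function names, the error rates, and the choice of GC content and homopolymer length as example constraints are all assumptions made for illustration.

```python
import random

BASES = "ACGT"


def noisy_channel(seq, p_sub=0.01, p_ins=0.005, p_del=0.005, rng=None):
    """Hypothetical toy channel: per-base substitution, insertion, deletion.

    The probabilities are illustrative placeholders, not measured error rates.
    """
    rng = rng or random.Random()
    out = []
    for base in seq:
        r = rng.random()
        if r < p_del:
            continue  # deletion: the base is dropped
        if r < p_del + p_ins:
            out.append(rng.choice(BASES))  # insertion: a random base is added
            out.append(base)               # before the original base
        elif r < p_del + p_ins + p_sub:
            # substitution: replace with one of the three other bases
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def gc_content(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    return sum(b in "GC" for b in seq) / len(seq) if seq else 0.0


def max_homopolymer(seq):
    """Length of the longest run of identical consecutive bases."""
    longest = run = 0
    prev = None
    for b in seq:
        run = run + 1 if b == prev else 1
        prev = b
        longest = max(longest, run)
    return longest


def satisfies_constraints(seq, gc_range=(0.4, 0.6), max_run=3):
    """Assumed example constraints: GC content window and homopolymer cap."""
    low, high = gc_range
    return low <= gc_content(seq) <= high and max_homopolymer(seq) <= max_run
```

In a training loop, an encoder's output would pass through something like `noisy_channel` before decoding, and a check like `satisfies_constraints` could inform a penalty term; how the framework actually implements either step is not stated in the abstract.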
Publisher
Cold Spring Harbor Laboratory