Author:
Song Li-Fu, Geng Feng, Gong Zi-Yi, Li Bing-Zhi, Yuan Ying-Jin
Abstract
DNA data storage is a rapidly developing technology with great potential owing to its high density, long-term stability, and low maintenance cost. The major technical challenges are errors such as indels, rearrangements, and strand breaks, which frequently arise during DNA synthesis, amplification, sequencing, and preservation. Here, we report the development of a de Bruijn graph-based greedy path search algorithm (DBG-GPS) designed to handle these errors. We demonstrate that accurate data recovery can be achieved from the products of deep error-prone PCR. The robustness of the method was further verified with one hundred parallel data retrievals from PCR products containing numerous rearrangements introduced by deliberately nonspecific amplification. With DBG-GPS, we successfully decoded the original information even after the DNA had been treated at 70°C for 65 days. DBG-GPS shows linear decoding complexity and is two orders of magnitude faster than multiple alignment-based methods, making it well suited to large-scale data storage.
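The abstract names the technique but gives no implementation details, so the following is only a minimal sketch of the general idea of a greedy path search over a de Bruijn graph. It is not the authors' DBG-GPS code. The function names `count_kmers` and `greedy_path`, the parameters `k` and `min_count`, the toy reads, and the rule of always following the best-supported k-mer are all assumptions made for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer in the reads; each k-mer is an edge of the de Bruijn graph."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def greedy_path(counts, start, length, k, min_count=2):
    """Extend `start` one base at a time along the best-supported k-mer.

    Assumption for this sketch: k-mers seen fewer than `min_count` times are
    treated as errors and ignored. Returns None if the path dead-ends.
    `start` must contain at least k - 1 bases.
    """
    strand = start
    while len(strand) < length:
        prefix = strand[-(k - 1):]
        candidates = [
            (counts[prefix + base], base)
            for base in "ACGT"
            if counts[prefix + base] >= min_count
        ]
        if not candidates:
            return None
        strand += max(candidates)[1]  # greedy choice: highest k-mer count
    return strand


# Toy data (hypothetical): three clean copies, one copy with a substitution,
# and one truncated copy standing in for a strand break.
original = "ACGTTGCAAGCTTAGC"
reads = [original] * 3 + ["ACGTTGCTAGCTTAGC", "ACGTTGCAAG"]

counts = count_kmers(reads, k=5)
recovered = greedy_path(counts, start="ACGT", length=len(original), k=5)
print(recovered == original)
```

Because each extension step inspects at most four candidate k-mers, the work grows linearly with strand length, which is consistent with the linear decoding complexity the abstract reports. The sketch leaves out any mechanism for verifying candidate strands.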
Publisher
Cold Spring Harbor Laboratory
Cited by
5 articles.