Quantifying molecular bias in DNA data storage

Published:2020-06-29 Issue:1 Volume:11
ISSN:2041-1723
Container-title:Nature Communications
language:en
Short-container-title:Nat Commun

Author:

Chen Yuan-Jyue, Takahashi Christopher N., Organick Lee, Bee Callista, Ang Siena Dumas, Weiss Patrick, Peck Bill, Seelig Georg, Ceze Luis, Strauss Karin

Abstract

DNA has recently emerged as an attractive medium for archival data storage. Recent work has demonstrated proof-of-principle prototype systems; however, very uneven (biased) sequencing coverage has been reported, which indicates inefficiencies in the storage process. Deviations from the average coverage in the sequence copy distribution can cause either wasteful provisioning in sequencing or an excessive number of missing sequences. Here, we use millions of unique sequences from a DNA-based digital data archival system to study the oligonucleotide copy unevenness problem and show that the two paramount sources of bias are the synthesis and amplification (PCR) processes. Based on these findings, we develop a statistical model for each molecular process as well as the overall process. We further use our model to explore the trade-offs between synthesis bias, storage physical density, logical redundancy, and sequencing redundancy, providing insights for engineering efficient, robust DNA data storage systems.

Funder

United States Department of Defense | Defense Advanced Research Projects Agency

Microsoft

Publisher

Springer Science and Business Media LLC

Subject

General Physics and Astronomy; General Biochemistry, Genetics and Molecular Biology; General Chemistry

Link

http://www.nature.com/articles/s41467-020-16958-3.pdf

References: 29 articles.

1. Zhirnov, V., Zadegan, R. M., Sandhu, G. S., Church, G. M. & Hughes, W. L. Nucleic acid memory. Nat. Mater. 15, 366–370 (2016).

2. Ceze, L., Nivala, J. & Strauss, K. Molecular digital data storage using DNA. Nat. Rev. Genet. 20, 456–466 (2019).