Abstract
Optical genome mapping (OGM) is a technique that extracts partial genomic information from optically imaged and linearized DNA fragments containing fluorescently labeled short sequence patterns. This information can be used for various genomic analyses and applications, such as the detection of structural variations and copy-number variations, epigenomic profiling, and microbial species identification. Currently, the choice of labeled patterns is based on the available bio-chemical methods, and is not necessarily optimized for the application. In this work, we develop a model of OGM based on information theory, which enables the design of optimal labeling patterns for specific applications and target organism genomes. We validated the model through experimental OGM on human DNA and simulations on bacterial DNA. Our model predicts up to 10-fold improved accuracy by optimal choice of labeling patterns, which may guide future development of OGM bio-chemical labeling methods and significantly improve its accuracy and yield for applications such as epigenomic profiling and cultivation-free pathogen identification in clinical samples.
Publisher
Cold Spring Harbor Laboratory