Removing the Bottleneck: Introducing cMatch - A Lightweight Tool for Construct-Matching in Synthetic Biology

Published: 2022-01-10 Volume: 9
ISSN: 2296-4185
Container-title: Frontiers in Bioengineering and Biotechnology
Short-container-title: Front. Bioeng. Biotechnol.

Author:

Casas Alexis, Bultelle Matthieu, Motraghi Charles, Kitney Richard

Abstract

We present a software tool, called cMatch, to reconstruct and identify synthetic genetic constructs from their sequences, or from a set of sub-sequences, based on two practical pieces of information: their modular structure and libraries of components. Although developed for combinatorial pathway engineering problems and addressing their quality control (QC) bottleneck, cMatch is not restricted to these applications. QC takes place post assembly, transformation and growth. It has a simple goal: to verify that the genetic material contained in a cell matches what was intended to be built and, when it does not, to locate the discrepancies and estimate their severity. In terms of reproducibility and reliability, the QC step is crucial. Failure at this step requires repetition of the construction and/or sequencing steps. When performed manually or semi-manually, QC is an extremely time-consuming, error-prone process, which scales very poorly with the number of constructs and their complexity. To make QC frictionless and more reliable, cMatch performs and automates an operation we have called “construct-matching”. Construct-matching is more thorough than simple sequence-matching, as it matches at the functional level and quantifies the matching both at the individual component level and across the whole construct. Two algorithms (called CM_1 and CM_2) are presented. They differ according to the nature of their inputs. CM_1 is the core algorithm for construct-matching and is to be used when input sequences are long enough to cover constructs in their entirety (e.g., obtained with methods such as next generation sequencing). CM_2 is an extension designed to deal with shorter sequences (e.g., obtained with Sanger sequencing) that need recombining. Both algorithms are shown to yield accurate construct-matching in a few minutes (even on hardware with limited processing power), together with a set of metrics that can be used to improve the robustness of the decision-making process. To ensure reliability and reproducibility, cMatch builds on the highly validated pairwise-matching Smith-Waterman algorithm. All the tests presented have been conducted on synthetic data for challenging yet realistic constructs, and on real data gathered during studies on a metabolic engineering example (lycopene production).
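To make the idea concrete, the following is a minimal sketch of component-level matching built on Smith-Waterman local alignment. It is not the cMatch implementation and does not reproduce CM_1 or CM_2: the function names, the scoring parameters, the toy sequences and the use of the lowest per-slot score as a whole-construct metric are all assumptions chosen for illustration. It also ignores the order and position of components, which a real construct-matcher would have to check.

```python
MATCH, MISMATCH, GAP = 2, -1, -2  # assumed scoring scheme, not taken from the paper


def smith_waterman(query, target):
    """Best local alignment score of query against target (linear gap penalty)."""
    prev = [0] * (len(target) + 1)
    best = 0
    for i in range(1, len(query) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            step = MATCH if query[i - 1] == target[j - 1] else MISMATCH
            curr[j] = max(0, prev[j - 1] + step, prev[j] + GAP, curr[j - 1] + GAP)
            best = max(best, curr[j])
        prev = curr
    return best


def match_slot(sequence, library):
    """Pick the library component that aligns best to the sequence.

    Scores are normalised by the score of a perfect match, so 1.0 means the
    component is found exactly somewhere in the sequence.
    """
    scores = {
        name: smith_waterman(component, sequence) / (MATCH * len(component))
        for name, component in library.items()
    }
    best_name = max(scores, key=scores.get)
    return best_name, scores[best_name]


def match_construct(sequence, template, libraries):
    """Match every slot of a modular template; report per-slot and overall scores."""
    per_slot = [match_slot(sequence, libraries[slot]) for slot in template]
    overall = min(score for _, score in per_slot)  # hypothetical construct-level metric
    return per_slot, overall


# Toy data (hypothetical): a two-slot construct with two candidates per slot.
libraries = {
    "promoter": {"P1": "TTGACAGCTAGC", "P2": "TTTACGGCTAGC"},
    "cds": {"G1": "ATGGCTAAAGGT", "G2": "ATGCCCGGGTTT"},
}
sequence = "CCTTGACAGCTAGCAAATGCCCGGGTTTGG"
per_slot, overall = match_construct(sequence, ["promoter", "cds"], libraries)
print(per_slot, overall)
```

Reporting a score per slot as well as one for the whole construct mirrors the two levels of quantification described in the abstract. A low score on a single slot points to where a discrepancy lies, and its magnitude gives a rough indication of severity.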

Funder

National Physical Laboratory

Engineering and Physical Sciences Research Council

Publisher

Frontiers Media SA

Subject

Biomedical Engineering, Histology, Bioengineering, Biotechnology

Cited by 3 articles.

1. Engineering biology and automation–Replicability as a design principle;Engineering Biology;2024-07-12

2. Opportunities for engineering outer membrane vesicles using synthetic biology approaches;Extracellular Vesicles and Circulating Nucleic Acids;2023

3. basicsynbio and the BASIC SEVA collection: software and vectors for an established DNA assembly method;Synthetic Biology;2022-02-01