Abstract
AbstractThe complete biosynthetic pathways are unknown for most natural products (NPs), it is thus valuable to make computer-aided bio-retrosynthesis predictions. Here, a navigable and user-friendly toolkit, BioNavi-NP, is developed to predict the biosynthetic pathways for both NPs and NP-like compounds. First, a single-step bio-retrosynthesis prediction model is trained using both general organic and biosynthetic reactions through end-to-end transformer neural networks. Based on this model, plausible biosynthetic pathways can be efficiently sampled through an AND-OR tree-based planning algorithm from iterative multi-step bio-retrosynthetic routes. Extensive evaluations reveal that BioNavi-NP can identify biosynthetic pathways for 90.2% of 368 test compounds and recover the reported building blocks as in the test set for 72.8%, 1.7 times more accurate than existing conventional rule-based approaches. The model is further shown to identify biologically plausible pathways for complex NPs collected from the recent literature. The toolkit as well as the curated datasets and learned models are freely available to facilitate the elucidation and reconstruction of the biosynthetic pathways for NPs.
Funder
National Natural Science Foundation of China
Publisher
Springer Science and Business Media LLC
Subject
General Physics and Astronomy,General Biochemistry, Genetics and Molecular Biology,General Chemistry,Multidisciplinary
Reference75 articles.
1. Dictionary of natural products (dnp), version 29.2. http://dnp.chemnetbase.com (Accessed 2021, April 8).
2. Banerjee, P. et al. Super natural II—a database of natural products. Nucleic Acids Res. 43, D935–D939 (2015).
3. Franck, B. Key building blocks of natural product biosynthesis and their significance in chemistry and medicine. Angew. Chem. Int Ed. Engl. 18, 429–439 (1979).
4. Walsh, C. T. & Tang, Y. Natural product biosynthesis: Chemical logic and enzymatic machinery. Royal Society of Chemistry (2017).
5. Caspi, R. et al. The MetaCyc database of metabolic pathways and enzymes - a 2019 update. Nucleic Acids Res. 48, D445–D453 (2020).
Cited by
53 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献