SAVI, in silico generation of billions of easily synthesizable compounds through expert-system type rules

Published: 2020-11-11 Issue: 1 Volume: 7
ISSN:2052-4463
Container-title:Scientific Data
language:en
Short-container-title:Sci Data

Author:

Patel Hitesh, Ihlenfeldt Wolf-Dietrich, Judson Philip N., Moroz Yurii S., Pevzner Yuri, Peach Megan L., Delannée Victorien, Tarasova Nadya I., Nicklaus Marc C.

Abstract

We have made available a database of over 1 billion compounds predicted to be easily synthesizable, called the Synthetically Accessible Virtual Inventory (SAVI). The compounds were generated by a set of transforms based on an adaptation and extension of the CHMTRN/PATRAN programming languages, which describe chemical synthesis expert knowledge and originally stem from the LHASA project. The chemoinformatics toolkit CACTVS was used to apply a total of 53 transforms to about 150,000 readily available building blocks (enamine.net). Only single-step, two-reactant syntheses were calculated for this database, even though the technology can execute multi-step reactions. The ability to incorporate scoring systems in CHMTRN allowed us to subdivide the database of 1.75 billion compounds into sets according to their predicted synthesizability, with the most-synthesizable class comprising 1.09 billion synthetic products. Properties calculated for all SAVI products show that the database should be well suited for drug discovery. It is publicly available for free download from https://doi.org/10.35115/37n9-5738.
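The figures quoted in the abstract can be put in proportion with a little arithmetic. The following is a minimal sketch, not part of the paper: it uses only the rounded numbers stated above, and the "naive combinations" count rests on the assumption (made here for illustration) that every transform is tried on every ordered pair of building blocks. The function and constant names are hypothetical.

```python
# Rounded figures as quoted in the abstract.
TOTAL_PRODUCTS = 1.75e9       # all SAVI products
TOP_CLASS_PRODUCTS = 1.09e9   # most-synthesizable class
N_TRANSFORMS = 53
N_BUILDING_BLOCKS = 150_000   # "about 150,000" in the abstract


def top_class_fraction(top=TOP_CLASS_PRODUCTS, total=TOTAL_PRODUCTS):
    """Share of all products that fall in the most-synthesizable class."""
    return top / total


def naive_combinations(n_transforms=N_TRANSFORMS, n_blocks=N_BUILDING_BLOCKS):
    """Count of (transform, reactant A, reactant B) triples.

    Assumption for illustration only: every transform is tried on every
    ordered pair of building blocks, one product per triple.
    """
    return n_transforms * n_blocks * n_blocks


def match_rate(total=TOTAL_PRODUCTS):
    """Products per naive combination, under the assumption above."""
    return total / naive_combinations()


if __name__ == "__main__":
    print(f"Top-class share of products: {top_class_fraction():.1%}")
    print(f"Naive combinations:          {naive_combinations():.3e}")
    print(f"Products per combination:    {match_rate():.2%}")
```

Under these assumptions, roughly 62% of the products fall in the most-synthesizable class, and only a small fraction of a percent of the naive reactant pairings yield a product, which is consistent with each transform requiring specific functional groups on both reactants.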

Publisher

Springer Science and Business Media LLC

Subject

Library and Information Sciences; Statistics, Probability and Uncertainty; Computer Science Applications; Education; Information Systems; Statistics and Probability

Link

http://www.nature.com/articles/s41597-020-00727-4.pdf

References: 95 articles.

1. ChemNavigator/Sigma Aldrich. iResearch Library. https://www.chemnavigator.com/cnc/products/iRL.asp (2018).

2. Bohacek, R. S., McMartin, C. & Guida, W. C. The art and practice of structure-based drug design: A molecular modeling perspective. Med. Res. Rev. 16, 3–50 (1996).

3. Ertl, P. Cheminformatics analysis of organic substituents: identification of the most common substituents, calculation of substituent properties, and automatic identification of drug-like bioisosteric groups. J. Chem. Inf. Comput. Sci. 43, 374–380 (2003).

4. Polishchuk, P. G., Madzhidov, T. I. & Varnek, A. Estimation of the size of drug-like chemical space based on GDB-17 data. J. Comput. Aided Mol. Des. 27, 675–679 (2013).

5. Gillet, V. J., Myatt, G., Zsoldos, Z. & Johnson, A. P. SPROUT, HIPPO and CAESA: Tools for de novo structure generation and estimation of synthetic accessibility. Perspect. Drug Discov. Des. 3, 34–50 (1995).

Cited by 38 articles.

1. The freedom space – a new set of commercially available molecules for hit discovery; Molecular Informatics; 2024-08-22

2. The IMS Library: from IN‐Stock to Virtual; ChemMedChem; 2024-08-05

3. The Pan-Canadian Chemical Library: A Mechanism to Open Academic Chemistry to High-Throughput Virtual Screening; Scientific Data; 2024-06-06

4. High Performance Binding Affinity Prediction with a Transformer-Based Surrogate Model; 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW); 2024-05-27

5. Correlation of protein binding pocket properties with hits’ chemistries used in generation of ultra-large virtual libraries; Journal of Computer-Aided Molecular Design; 2024-05-16