Abstract
Structure-constrained molecular optimisation aims to improve the target pharmacological properties of input molecules through small perturbations of their molecular structures. Previous studies have applied various optimisation techniques to this task, but several have struggled to produce molecules that are both property-improved and synthetically feasible. To achieve both, we propose SELF-EdiT, a molecular structure editing model that combines self-referencing embedded strings (SELFIES) with the Levenshtein transformer. SELF-EdiT generates new molecules that resemble the seed molecule by iteratively applying fragment-based deletion-and-insertion operations to its SELFIES representation. It uses a grammar-based SELFIES tokenization method together with the Levenshtein transformer to learn these editing operations efficiently. In our results, SELF-EdiT outperformed existing structure-constrained molecular optimisation models by a considerable margin in both success and total scores on the two benchmark datasets. Edit-path analysis further confirmed that the proposed model can improve pharmacological properties without large perturbations of the molecular structure. In addition, our fragment-based approach significantly relieved the SELFIES collapse problem compared with the existing SELFIES-based model. SELF-EdiT is the first attempt to apply editing operations to SELFIES for editing-based optimisation, which may be helpful for researchers planning to use SELFIES.
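The deletion-and-insertion editing described in the abstract can be illustrated with a minimal Python sketch. It is not the SELF-EdiT implementation: the function names `tokenize_selfies` and `apply_edit` are hypothetical, the edits are specified by hand rather than predicted by a Levenshtein transformer, and the example works on single SELFIES symbols rather than the fragments used in the paper. It only shows what one delete-then-insert pass over a tokenized SELFIES string looks like.

```python
import re


def tokenize_selfies(selfies):
    """Split a SELFIES string into its bracketed symbols, e.g. '[C][=O]'."""
    tokens = re.findall(r"\[[^\]]*\]", selfies)
    if "".join(tokens) != selfies:
        raise ValueError("input is not a sequence of bracketed symbols")
    return tokens


def apply_edit(tokens, delete_mask, insertions):
    """Apply one deletion-and-insertion pass (hand-specified, for illustration).

    delete_mask[i] is True when tokens[i] should be removed.
    insertions maps a slot index in the kept sequence (0..len(kept))
    to a list of tokens placed before that slot.
    """
    if len(delete_mask) != len(tokens):
        raise ValueError("delete_mask must have one entry per token")
    # Deletion step: keep only the tokens that are not marked.
    kept = [tok for tok, drop in zip(tokens, delete_mask) if not drop]
    # Insertion step: fill each slot, then copy the kept token after it.
    edited = []
    for slot in range(len(kept) + 1):
        edited.extend(insertions.get(slot, []))
        if slot < len(kept):
            edited.append(kept[slot])
    return edited


# Seed: ethanol written as SELFIES symbols.
seed = tokenize_selfies("[C][C][O]")
# Delete the oxygen, then insert a nitrogen in the final slot.
edited = apply_edit(seed, [False, False, True], {2: ["[N]"]})
print("".join(edited))
```

In a learned model the deletion mask and the inserted tokens would come from the network's predictions at each iteration, and the pass would be repeated until no further edits are proposed.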
Publisher
Springer Science and Business Media LLC