Automatic Identification of Analogue Series from Large Compound Data Sets: Methods and Applications

Published: 2021-08-31
Volume: 26
Issue: 17
Page: 5291
ISSN: 1420-3049
Container-title: Molecules
Language: en

Author:

Naveja José J., Vogt Martin

Abstract

Analogue series play a key role in drug discovery. They arise naturally in lead optimization efforts where analogues are explored based on one or a few core structures. However, it is much harder to accurately identify and extract pairs or series of analogue molecules in large compound databases with no predefined core structures. This methodological review outlines the most common and recent methodological developments to automatically identify analogue series in large libraries. Initial approaches focused on using predefined rules to extract scaffold structures, such as the popular Bemis–Murcko scaffold. Later on, the matched molecular pair concept led to efficient algorithms to identify similar compounds sharing a common core structure by exploring many putative scaffolds for each compound. Further developments of these ideas yielded, on the one hand, approaches for hierarchical scaffold decomposition and, on the other hand, algorithms for the extraction of analogue series based on single-site modifications (so-called matched molecular series) by exploring potential scaffold structures based on systematic molecule fragmentation. Eventually, further development of these approaches resulted in methods for extracting analogue series defined by a single core structure with several substitution sites that allow convenient representations, such as R-group tables. These methods enable the efficient analysis of large data sets with hundreds of thousands or even millions of compounds and have spawned many related methodological developments.
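The indexing idea behind the matched molecular pair and matched molecular series approaches mentioned in the abstract can be illustrated with a minimal sketch. It assumes the fragmentation step has already been carried out elsewhere (for example with a cheminformatics toolkit) and that each compound comes with a list of putative (core, substituent) pairs. The function name `find_series`, the compound IDs, and the SMILES-like strings are hypothetical and hand-written for illustration; they are not taken from the review, and practical methods add constraints (such as core-to-substituent size ratios) that are omitted here.

```python
from collections import defaultdict


def find_series(fragmentations, min_size=2):
    """Group compounds that share a putative core.

    `fragmentations` maps a compound ID to a list of (core, substituent)
    pairs, one pair per way of cutting the molecule at a single site.
    Returns a dict mapping each core shared by at least `min_size`
    compounds to a {compound ID: substituent} dict.
    """
    index = defaultdict(dict)
    for compound_id, pairs in fragmentations.items():
        for core, substituent in pairs:
            # Every putative core of every compound becomes an index key.
            index[core][compound_id] = substituent
    # A core seen in only one compound does not define a series.
    return {core: members for core, members in index.items()
            if len(members) >= min_size}


# Hypothetical, hand-written single-cut fragmentations (illustrative only).
fragmentations = {
    "cpd1": [("[*:1]c1ccccc1", "[*:1]C")],
    "cpd2": [("[*:1]c1ccccc1", "[*:1]O")],
    "cpd3": [("[*:1]c1ccccc1", "[*:1]Cl"),
             ("[*:1]Cl", "[*:1]c1ccccc1")],  # second putative core for cpd3
    "cpd4": [("[*:1]c1ccncc1", "[*:1]C")],
}

series = find_series(fragmentations)
for core, members in series.items():
    print(core, sorted(members))
```

In this toy input, only the phenyl core is shared by more than one compound, so it yields a single series of three analogues that differ at one substitution site. Because every putative core of every compound is indexed in one pass, no pairwise comparison of compounds is needed, which is what makes this style of approach practical for large data sets.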

Publisher

MDPI AG

Subject

Chemistry (miscellaneous), Analytical Chemistry, Organic Chemistry, Physical and Theoretical Chemistry, Molecular Medicine, Drug Discovery, Pharmaceutical Science

Link

https://www.mdpi.com/1420-3049/26/17/5291/pdf

Cited by 6 articles.

1. Visualization, Exploration, and Screening of Chemical Space in Drug Discovery; Computational Drug Discovery; 2024-01-19

2. Cheminformatics and artificial intelligence for accelerating agrochemical discovery; Frontiers in Chemistry; 2023-11-29

3. ZINC-22─A Free Multi-Billion-Scale Database of Tangible Compounds for Ligand Discovery; Journal of Chemical Information and Modeling; 2023-02-15

4. Data-Driven Approaches Used for Compound Library Design for the Treatment of Parkinson’s Disease; International Journal of Molecular Sciences; 2023-01-06

5. Scaffold Generator: a Java library implementing molecular scaffold functionalities in the Chemistry Development Kit (CDK); Journal of Cheminformatics; 2022-11-10