Developing and validating predictive decision tree models from mining chemical structural fingerprints and high-throughput screening data in PubChem

Published: 2008-09-25 Issue: 1 Volume: 9
ISSN:1471-2105
Container-title:BMC Bioinformatics
language:en

Author:

Han Lianyi, Wang Yanli, Bryant Stephen H

Abstract

Background: Recent advances in high-throughput screening (HTS) techniques, together with readily available compound libraries generated by combinatorial chemistry or derived from natural products, enable the testing of millions of compounds in a matter of days. Because of the amount of information HTS assays produce, mining HTS data for results of potential interest to drug development research is very challenging. Computational approaches to analyzing HTS results must contend with both the large quantity of information and the significant amount of erroneous data produced.

Results: In this study, decision tree (DT) based models were developed to discriminate compound bioactivities using the chemical structure fingerprints provided in the PubChem system (http://pubchem.ncbi.nlm.nih.gov). The DT models were examined for filtering biological activity data contained in four assays deposited in the PubChem Bioassay Database, including assays tested for 5HT1a agonists, antagonists, and HIV-1 RT-RNase H inhibitors. The 10-fold cross-validation (CV) sensitivity, specificity, and Matthews correlation coefficient (MCC) of the models are 57.2 to 80.5%, 97.3 to 99.0%, and 0.4 to 0.5, respectively. A further evaluation was performed for DT models built for two independent bioassays in which inhibitors of the same HIV RNase target were screened using different compound libraries; this experiment yielded enrichment factors of 4.4 and 9.7.

Conclusion: Our results suggest that the designed DT models can be used as a virtual screening technique and as a complement to traditional approaches for hit selection.
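The abstract reports sensitivity, specificity, MCC, and enrichment factor without defining them. The Python sketch below shows one common way to compute these quantities from confusion-matrix counts. It is an illustration only: the function names and the example counts are hypothetical, and the enrichment factor is assumed here to be the hit rate among predicted actives divided by the hit rate in the whole library, which may differ from the definition used in the paper.

```python
import math


def screening_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts.

    tp/fn: active compounds predicted active/inactive.
    tn/fp: inactive compounds predicted inactive/active.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal total is zero; return 0.0 by convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


def enrichment_factor(tp, fp, tn, fn):
    """Assumed definition: hit rate among predicted actives over library hit rate."""
    selected = tp + fp
    total = tp + fp + tn + fn
    actives = tp + fn
    if selected == 0 or actives == 0:
        return 0.0
    return (tp / selected) / (actives / total)


if __name__ == "__main__":
    # Hypothetical counts for illustration, not taken from the paper.
    counts = dict(tp=60, fp=20, tn=880, fn=40)
    sens, spec, mcc = screening_metrics(**counts)
    print(f"sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
    print(f"enrichment factor={enrichment_factor(**counts):.2f}")
```

MCC is useful alongside sensitivity and specificity because HTS data are typically dominated by inactive compounds, and MCC accounts for all four cells of the confusion matrix.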

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Structural Biology

Link

https://link.springer.com/content/pdf/10.1186/1471-2105-9-401.pdf

References: 49 articles.

1. Burbaum JJ, Sigal NH: New technologies for high-throughput screening. Curr Opin Chem Biol 1997, 1(1):72–78.

2. Hann MM, Oprea TI: Pursuing the leadlikeness concept in pharmaceutical research. Curr Opin Chem Biol 2004, 8(3):255–263.

3. Cox B, Denyer JC, Binnie A, Donnelly MC, Evans B, Green DV, Lewis JA, Mander TH, Merritt AT, Valler MJ, et al.: Application of high-throughput screening techniques to drug discovery. Prog Med Chem 2000, 37: 83–133.

4. Walters WP, Namchuk M: Designing screens: how to make your hits a hit. Nat Rev Drug Discov 2003, 2(4):259–266.

5. Kevorkov D, Makarenkov V: Statistical analysis of systematic errors in high-throughput screening. J Biomol Screen 2005, 10(6):557–567.

Cited by 86 articles.

1. Utility of human cytochrome P450 inhibition data in the assessment of drug-induced liver injury;Xenobiotica;2024-08-08

2. Use of Bioinformatics in High-Throughput Drug Screening;Advances in Bioinformatics;2024

3. Andrographolide: A Diterpenoid from Cymbopogon schoenanthus Identified as a New Hit Compound against Trypanosoma cruzi Using Machine Learning and Experimental Approaches;Journal of Chemical Information and Modeling;2023-12-26

4. From Black Boxes to Actionable Insights: A Perspective on Explainable Artificial Intelligence for Scientific Discovery;Journal of Chemical Information and Modeling;2023-12-11

5. Data Sharing in Chemistry: Lessons Learned and a Case for Mandating Structured Reaction Data;Journal of Chemical Information and Modeling;2023-07-05