Traditional Machine and Deep Learning for Predicting Toxicity Endpoints

Published:2022-12-26 Issue:1 Volume:28 Page:217
ISSN:1420-3049
Container-title:Molecules
language:en
Short-container-title:Molecules

Author:

Norinder Ulf

Abstract

Molecular structure-property modeling is an increasingly important tool for predicting compounds with desired properties, because drug discovery and development is expensive and resource-intensive and suffers from toxicity-related attrition in its late phases. Lately, interest in applying deep learning techniques has increased considerably. This investigation compares the traditional physico-chemical descriptor and machine learning-based approaches through autoencoder-generated descriptors to two different descriptor-free, Simplified Molecular Input Line Entry System (SMILES)-based, deep learning architectures of the Bidirectional Encoder Representations from Transformers (BERT) type, using the Mondrian aggregated conformal prediction method as the overarching framework. For the binary CATMoS non-toxic and very-toxic datasets, the results show that all methods perform equally well on the former, which is almost equally balanced, while on the latter, which has an 11-fold difference between the two classes, the MolBERT model based on a large pre-trained network performs somewhat better than the rest, with high efficiency for both classes (0.93–0.94) as well as high values for sensitivity, specificity and balanced accuracy (0.86–0.87). The descriptor-free, SMILES-based, deep learning BERT architectures seem capable of producing well-balanced predictive models with defined applicability domains. This work also demonstrates that the class imbalance problem is handled gracefully by Mondrian conformal prediction, without over- and/or under-sampling, weighting of classes or cost-sensitive methods.
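
To illustrate why Mondrian conformal prediction copes with class imbalance, the following is a minimal sketch of a single-split, class-conditional (Mondrian) inductive conformal classifier. It is not the author's implementation: the function names, the nonconformity measure (one minus the predicted probability of the class) and the toy calibration data are all assumptions made for illustration. The aggregated variant named in the abstract would repeat the calibration over several splits and combine the resulting p-values (for example by taking the median), which is not shown here.

```python
from collections import defaultdict


def calibrate(cal_probs, cal_labels):
    """Collect nonconformity scores per true class (the Mondrian step).

    Assumed nonconformity measure: 1 - predicted probability of the true class.
    """
    scores = defaultdict(list)
    for probs, label in zip(cal_probs, cal_labels):
        scores[label].append(1.0 - probs[label])
    return dict(scores)


def p_values(scores, probs):
    """Return one p-value per class, using only that class's calibration scores."""
    result = {}
    for label, cal in scores.items():
        alpha = 1.0 - probs[label]
        at_least = sum(1 for s in cal if s >= alpha)
        result[label] = (at_least + 1) / (len(cal) + 1)
    return result


def prediction_set(scores, probs, significance):
    """Keep every class whose p-value exceeds the significance level."""
    return {c for c, p in p_values(scores, probs).items() if p > significance}


def efficiency(sets):
    """Fraction of single-label predictions (one common definition of efficiency)."""
    return sum(1 for s in sets if len(s) == 1) / len(sets)


# Hypothetical, imbalanced calibration set: 8 objects of class 0, 3 of class 1.
# Each entry maps class label -> predicted probability from some underlying model.
p0 = [0.9, 0.8, 0.95, 0.7, 0.85, 0.6, 0.9, 0.75]   # P(class 0) for true class 0
p1 = [0.6, 0.4, 0.7]                               # P(class 1) for true class 1
cal_probs = [{0: p, 1: 1.0 - p} for p in p0] + [{0: 1.0 - p, 1: p} for p in p1]
cal_labels = [0] * len(p0) + [1] * len(p1)

scores = calibrate(cal_probs, cal_labels)

test_objects = [{0: 0.88, 1: 0.12}, {0: 0.35, 1: 0.65}, {0: 0.62, 1: 0.38}]
sets = [prediction_set(scores, probs, significance=0.3) for probs in test_objects]
print(sets, efficiency(sets))
```

Because each class is calibrated only against its own examples, the minority class receives its own error-rate guarantee rather than being swamped by the majority class, which is why no resampling or class weighting is needed. An empty or two-label prediction set marks an object the model cannot classify confidently at the chosen significance level, which is one way to read the "defined applicability domains" mentioned above.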

Funder

Swedish Foundation for Strategic Environmental Research

Publisher

MDPI AG

Subject

Chemistry (miscellaneous),Analytical Chemistry,Organic Chemistry,Physical and Theoretical Chemistry,Molecular Medicine,Drug Discovery,Pharmaceutical Science

Link

https://www.mdpi.com/1420-3049/28/1/217/pdf

Cited by 3 articles.

1. CPSign: conformal prediction for cheminformatics modeling;Journal of Cheminformatics;2024-06-28

2. CPSign - Conformal Prediction for Cheminformatics Modeling;2023-11-22

3. Applicability domains of neural networks for toxicity prediction;AIMS Mathematics;2023