HelixADMET: a robust and endpoint extensible ADMET system incorporating self-supervised knowledge transfer

Author:

Zhang Shanzhuo1, Yan Zhiyuan1, Huang Yueyang1, Liu Lihang1, He Donglong1, Wang Wei2, Fang Xiaomin1, Zhang Xiaonan1, Wang Fan1, Wu Hua3, Wang Haifeng3

Affiliation:

1. Department of Natural Language Processing, Baidu International Technology (Shenzhen) Co., Ltd, Shenzhen 518000, China

2. School of Computer Science and Technology, Harbin Institute of Technology (HIT), Shenzhen 518000, China

3. Baidu Inc, Beijing 100000, China

Abstract

Motivation: Accurate ADMET (absorption, distribution, metabolism, excretion and toxicity) prediction can efficiently screen out undesirable drug candidates at an early stage of drug discovery. In recent years, multiple comprehensive ADMET systems that adopt advanced machine learning models have been developed, providing services to estimate multiple endpoints. However, those systems usually suffer from weak extrapolation ability. First, because labelled data for each endpoint are scarce, typical machine learning models perform poorly on molecules with unobserved scaffolds. Second, most systems provide only fixed built-in endpoints and cannot be customized to satisfy various research requirements. To address these limitations, we develop a robust and endpoint-extensible ADMET system, HelixADMET (H-ADMET). H-ADMET incorporates self-supervised learning to produce a robust pre-trained model. The model is then fine-tuned with a multi-task and multi-stage framework to transfer knowledge between ADMET endpoints, auxiliary tasks and self-supervised tasks.

Results: Our results demonstrate that H-ADMET achieves an overall improvement of 4% compared with existing ADMET systems on comparable endpoints. Additionally, the pre-trained model provided by H-ADMET can be fine-tuned to generate new and customized ADMET endpoints, meeting the varied demands of drug research and development.

Availability and implementation: H-ADMET is freely accessible at https://paddlehelix.baidu.com/app/drug/admet/train.

Supplementary information: Supplementary data are available at Bioinformatics online.
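The abstract does not specify how the multi-task fine-tuning loss is computed. One common way to train on several endpoints at once when most molecules are labelled for only a few of them is to average the loss over observed labels only. The sketch below illustrates that idea in plain Python; the function name `masked_multitask_mse`, the use of `None` for missing labels, and the toy batch are all assumptions made for illustration, not the H-ADMET implementation.

```python
from typing import Optional, Sequence


def masked_multitask_mse(
    predictions: Sequence[Sequence[float]],
    labels: Sequence[Sequence[Optional[float]]],
) -> float:
    """Mean squared error over observed (molecule, endpoint) pairs only.

    predictions[i][j] is the model output for molecule i on endpoint j.
    labels[i][j] is the measured value, or None when that endpoint was
    never measured for that molecule (an assumed encoding).
    """
    total = 0.0
    observed = 0
    for pred_row, label_row in zip(predictions, labels):
        for pred, label in zip(pred_row, label_row):
            if label is None:
                continue  # a missing label contributes nothing to the loss
            total += (pred - label) ** 2
            observed += 1
    if observed == 0:
        raise ValueError("no observed labels in this batch")
    return total / observed


# Hypothetical batch: 3 molecules, 2 endpoints, sparse labels.
preds = [[0.5, 1.0], [0.0, 2.0], [1.5, 0.0]]
labels = [[1.0, None], [None, 2.0], [1.5, None]]
loss = masked_multitask_mse(preds, labels)
```

Because unlabelled entries are skipped rather than treated as zeros, every endpoint can share one encoder without sparse endpoints being penalised for data they never had.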

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability
