Identifying the kind behind SMILES—anatomical therapeutic chemical classification using structure-only representations

Author:

Cao Yi (1), Yang Zhen-Qun (2), Zhang Xu-Lu (1), Fan Wenqi (3), Wang Yaowei (4), Shen Jiajun (5), Wei Dong-Qing (6), Li Qing (3), Wei Xiao-Yong (1, 3)

Affiliation:

1. Department of Computer Science, Sichuan University, 610065, Chengdu, China

2. Department of Biomedical Engineering, Chinese University of Hong Kong, Street, Shatin, Hong Kong

3. Department of Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong

4. Peng Cheng Laboratory, 518000, Shenzhen, China

5. TCL AI Research Institute, Hong Kong

6. School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China

Abstract

Anatomical Therapeutic Chemical (ATC) classification of compounds and drugs plays an important role in drug development and basic research. However, previous methods depend on interactions extracted from the STITCH dataset, which may in turn make them dependent on lab experiments. We present a pilot study that explores the possibility of conducting ATC prediction based solely on molecular structures. The motivation is to eliminate the reliance on costly lab experiments, so that the characteristics of a drug can be pre-assessed before actual development, supporting better decision-making and saving effort. To this end, we construct a new benchmark of 4545 compounds, which is larger in scale than the one used in previous work. We also propose a lightweight prediction model. The model offers better explainability in the sense that it consists of a straightforward tokenization that extracts and embeds statistically and physicochemically meaningful tokens, and a deep network backed by a set of pyramid kernels that captures multi-resolution chemical structural characteristics. Its efficacy has been validated in experiments, where it outperforms state-of-the-art methods by 15.53% in accuracy and by 69.66% in efficiency. We make the benchmark dataset, source code and web server openly available to ease reproduction of this study.

Funder

National Natural Science Foundation of China

Hong Kong Polytechnic University

Publisher

Oxford University Press (OUP)

Subject

Molecular Biology, Information Systems

