Transfer learning for drug–target interaction prediction

Author:

Dalkıran Alperen (1, 2), Atakan Ahmet (1, 3), Rifaioğlu Ahmet S (4, 5), Martin Maria J (6), Atalay Rengül Çetin (7), Acar Aybar C (8), Doğan Tunca (6, 9), Atalay Volkan (1)

Affiliation:

1. Department of Computer Engineering, Middle East Technical University, Ankara 06800, Turkey

2. Department of Computer Engineering, Adana Alparslan Türkeş Science and Technology University, Adana 01250, Turkey

3. Department of Computer Engineering, Erzincan Binali Yıldırım University, Erzincan 24002, Turkey

4. Department of Computer Engineering, Iskenderun Technical University, Hatay 31200, Turkey

5. Faculty of Medicine, Institute for Computational Biomedicine, Heidelberg University and Heidelberg University Hospital, Heidelberg 69120, Germany

6. European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Cambridge, Hinxton CB10 1SD, United Kingdom

7. Faculty of Pulmonary and Critical Care Medicine, the University of Chicago, Chicago, IL, 60637, United States

8. Cancer Systems Biology Laboratory (Kansil), Middle East Technical University, Ankara 06800, Turkey

9. Department of Computer Engineering, Hacettepe University, Ankara 06800, Turkey

Abstract

Motivation

AI-driven approaches to drug–target interaction (DTI) prediction require large volumes of training data, which are not available for the majority of target proteins. In this study, we investigate the use of deep transfer learning to predict interactions between drug candidate compounds and understudied target proteins with scarce training data. The idea is to first train a deep neural network classifier on a large, generalized source training dataset, and then to reuse this pre-trained network as the initial configuration for re-training/fine-tuning on a small, specialized target training dataset. To explore this idea, we selected six protein families of critical importance in biomedicine: kinases, G-protein-coupled receptors (GPCRs), ion channels, nuclear receptors, proteases, and transporters. In two independent experiments, the transporter and nuclear receptor families were individually set as the target datasets, while the remaining five families were used as the source datasets. Several target family training datasets of different sizes were formed in a controlled manner to assess the benefit provided by the transfer learning approach.

Results

We present a systematic evaluation of our approach by pre-training a feed-forward neural network on the source training datasets and applying different modes of transfer learning from the pre-trained source network to a target dataset. The performance of deep transfer learning is evaluated and compared with that of training the same deep neural network from scratch. We found that when the training dataset contains fewer than 100 compounds, transfer learning outperforms the conventional strategy of training the system from scratch, suggesting that transfer learning is advantageous for predicting binders to understudied targets.

Availability and implementation

The source code and datasets are available at https://github.com/cansyl/TransferLearning4DTI. Our web-based service containing the ready-to-use pre-trained models is accessible at https://tl4dti.kansil.org.
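The pre-train-then-reuse workflow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository above for that). Everything here is an assumption made for illustration: the synthetic data standing in for compound feature vectors, the single hidden layer, the hyperparameters, and the two transfer modes shown (fine-tuning all layers versus freezing the hidden layer). The abstract does not specify which modes were used.

```python
import numpy as np

def init_params(n_in, n_hidden, rng):
    """Randomly initialise a network with one tanh hidden layer and one sigmoid output."""
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(params, X):
    """Return hidden activations and predicted interaction probabilities."""
    h = np.tanh(X @ params["W1"] + params["b1"])
    z = h @ params["W2"] + params["b2"]
    return h, (1.0 / (1.0 + np.exp(-z))).ravel()

def train(params, X, y, epochs=300, lr=0.5, freeze_hidden=False):
    """Full-batch gradient descent on binary cross-entropy.

    Starts from a copy of `params`, so the caller's weights are not modified.
    With freeze_hidden=True only the output layer is updated.
    """
    params = {k: v.copy() for k, v in params.items()}
    n = len(y)
    for _ in range(epochs):
        h, p = forward(params, X)
        dz = ((p - y) / n)[:, None]                # gradient w.r.t. the output logit
        dh = (dz @ params["W2"].T) * (1.0 - h**2)  # backpropagate through tanh
        params["W2"] -= lr * (h.T @ dz)
        params["b2"] -= lr * dz.sum(axis=0)
        if not freeze_hidden:
            params["W1"] -= lr * (X.T @ dh)
            params["b1"] -= lr * dh.sum(axis=0)
    return params

rng = np.random.default_rng(0)
n_features = 16  # hypothetical stand-in for a compound descriptor vector

# Synthetic "source" task: large dataset (assumed data, not from the paper).
w_src = rng.normal(size=n_features)
X_src = rng.normal(size=(2000, n_features))
y_src = (X_src @ w_src > 0).astype(float)

# Synthetic "target" task: related decision rule, fewer than 100 examples.
w_tgt = w_src + 0.3 * rng.normal(size=n_features)
X_tgt = rng.normal(size=(40, n_features))
y_tgt = (X_tgt @ w_tgt > 0).astype(float)

# 1) Pre-train on the source dataset.
pretrained = train(init_params(n_features, 8, rng), X_src, y_src)

# 2a) Transfer mode A: fine-tune every layer on the target dataset.
finetuned = train(pretrained, X_tgt, y_tgt, epochs=100, lr=0.1)

# 2b) Transfer mode B: keep the hidden layer fixed, retrain only the output layer.
frozen = train(pretrained, X_tgt, y_tgt, epochs=100, lr=0.1, freeze_hidden=True)

# Baseline: train the same architecture from scratch on the target dataset only.
scratch = train(init_params(n_features, 8, rng), X_tgt, y_tgt, epochs=100, lr=0.1)
```

Comparing `finetuned`, `frozen`, and `scratch` on held-out target examples mirrors the comparison the abstract describes. The synthetic data above is not meant to reproduce the reported results.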

Funder

TUBITAK

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

Cited by 17 articles.
