GraphEGFR: Multi‐task and transfer learning based on molecular graph attention mechanism and fingerprints improving inhibitor bioactivity prediction for EGFR family proteins on data scarcity

Author:

Boonyarit Bundit (1), Yamprasert Nattawin (2), Kaewnuratchadasorn Pawit (3), Kinchagawat Jiramet (1), Prommin Chanatkran (1), Rungrotmongkol Thanyada (4, 5), Nutanong Sarana (1)

Affiliation:

1. School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology, Rayong, Thailand

2. School of Information, Computer, and Communication Technology, Sirindhorn International Institute of Technology, Thammasat University, Pathum Thani, Thailand

3. Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand

4. Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Bangkok, Thailand

5. Center of Excellence in Structural and Computational Biology Research Unit, Department of Biochemistry, Faculty of Science, Chulalongkorn University, Bangkok, Thailand

Abstract

The proteins within the human epidermal growth factor receptor (EGFR) family, members of the tyrosine kinase receptor family, play a pivotal role in the molecular mechanisms driving the development of various tumors. Tyrosine kinase inhibitors, key compounds in targeted therapy, encounter challenges in cancer treatment due to emerging drug resistance mutations. Consequently, machine learning has undergone significant evolution to address the challenges of cancer drug discovery related to EGFR family proteins. However, the application of deep learning in this area is hindered by inherent difficulties associated with small‐scale data, particularly the risk of overfitting. Moreover, the design of a model architecture that facilitates learning through multi‐task and transfer learning, coupled with appropriate molecular representation, poses substantial challenges. In this study, we introduce GraphEGFR, a deep learning regression model designed to enhance molecular representation and model architecture for predicting the bioactivity of inhibitors against both wild‐type and mutant EGFR family proteins. GraphEGFR integrates a graph attention mechanism for molecular graphs with deep and convolutional neural networks for molecular fingerprints. We observed that GraphEGFR models employing multi‐task and transfer learning strategies generally achieve predictive performance comparable to existing competitive methods. The integration of molecular graphs and fingerprints adeptly captures relationships between atoms and enables both global and local pattern recognition. We further validated potential multi‐targeted inhibitors for wild‐type and mutant HER1 kinases, exploring key amino acid residues through molecular dynamics simulations to understand molecular interactions. This predictive model offers a robust strategy that could significantly contribute to overcoming the challenges of developing deep learning models for drug discovery with limited data and exploring new frontiers in multi‐targeted kinase drug discovery for EGFR family proteins.
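
The general idea of combining a graph attention step over a molecular graph with a fingerprint vector can be illustrated with a minimal pure-Python sketch. This is not the GraphEGFR implementation: the toy three-atom graph, the feature encoding, every weight value, and the names `gat_layer` and `predict` are hypothetical, and the fingerprint is concatenated directly rather than passed through the deep and convolutional networks the abstract mentions.

```python
import math

def matvec(W, x):
    # Multiply matrix W (list of rows) by vector x.
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gat_layer(features, neighbors, W, a):
    """One single-head, GAT-style attention step over atoms (illustrative only)."""
    projected = [matvec(W, h) for h in features]
    updated, attention = [], []
    for i, nbrs in enumerate(neighbors):
        idx = [i] + list(nbrs)  # self-loop plus bonded atoms
        scores = [
            leaky_relu(sum(c * v for c, v in zip(a, projected[i] + projected[j])))
            for j in idx
        ]
        alpha = softmax(scores)  # attention weights over atom i's neighborhood
        attention.append(dict(zip(idx, alpha)))
        updated.append([
            sum(al * projected[j][k] for al, j in zip(alpha, idx))
            for k in range(len(W))
        ])
    return updated, attention

def predict(features, neighbors, fingerprint, W, a, head_w, head_b):
    """Mean-pool atom embeddings, append the fingerprint, apply a linear head."""
    updated, attention = gat_layer(features, neighbors, W, a)
    n = len(updated)
    graph_vec = [sum(h[k] for h in updated) / n for k in range(len(updated[0]))]
    joint = graph_vec + list(fingerprint)
    score = sum(w * x for w, x in zip(head_w, joint)) + head_b
    return score, attention, joint

# Hypothetical toy input: heavy atoms C-C-O with one-hot element features.
FEATURES = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
NEIGHBORS = [[1], [0, 2], [1]]           # bonds as adjacency lists
FINGERPRINT = [1, 0, 1, 0]               # made-up 4-bit fingerprint
W = [[0.5, -0.2], [0.1, 0.3]]            # made-up projection weights
A = [0.4, -0.1, 0.2, 0.3]                # made-up attention vector
HEAD_W = [0.7, -0.5, 0.2, 0.1, -0.3, 0.4]
HEAD_B = 0.1

score, attention, joint = predict(
    FEATURES, NEIGHBORS, FINGERPRINT, W, A, HEAD_W, HEAD_B
)
print("toy bioactivity score:", score)
print("attention weights for the middle atom:", attention[1])
```

In this sketch the attention weights let each atom weigh its bonded neighbors differently (local structure), while the fingerprint bits contribute molecule-level substructure flags to the same regression head.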

Funder

Vidyasirimedhi Institute of Science and Technology

Chulalongkorn University

Publisher

Wiley
