Prediction of interactions between cell surface proteins by machine learning

Author:

Su Zhaoqian¹, Griffin Brian², Emmons Scott², Wu Yinghao¹

Affiliation:

1. Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, USA

2. Department of Genetics, Albert Einstein College of Medicine, Bronx, New York, USA

Abstract

Cells detect changes in their external environments or communicate with each other through proteins on their surfaces. These cell surface proteins form a complicated network of interactions in order to fulfill their functions. The interactions between cell surface proteins are highly dynamic and, thus, challenging to detect using traditional experimental techniques. Here, we tackle this challenge using a computational framework. The primary focus of the framework is to develop new tools to identify interactions between domains in the immunoglobulin (Ig) fold, which is the most abundant domain family in cell surface proteins. These interactions could be formed between ligands and receptors from different cells or between proteins on the same cell surface. In practice, we collected all structural data on Ig domain interactions and transformed them into an interface fragment pair library. A high‐dimensional profile can then be constructed from the library for a given pair of query protein sequences. Multiple machine learning models were used to read this profile so that the probability of interaction between the query proteins could be predicted. We tested our models on an experimentally derived dataset that contains 564 cell surface proteins in humans. The cross‐validation results show that we can achieve higher than 70% accuracy in identifying the protein-protein interactions (PPIs) within this dataset. We then applied this method to a group of 46 cell surface proteins in Caenorhabditis elegans. We screened every possible interaction between these proteins. Many interactions recognized by our machine learning classifiers have been experimentally confirmed in the literature. In conclusion, our computational platform serves as a useful tool to help identify potential new interactions between cell surface proteins in addition to current state‐of‐the‐art experimental techniques. The tool is freely accessible for use by the scientific community. Moreover, the general framework of the machine learning classification can also be extended to study the interactions of proteins in other domain superfamilies.
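
As a rough illustration of the profile idea described in the abstract, the sketch below turns a pair of query sequences into one feature per entry of a fragment pair library, and enumerates the pairs for an all-against-all screen. Everything in it is an assumption made for illustration: the three fragment pairs are invented, and the ungapped identity score, the product-and-maximum rule, and the function names `best_match`, `pair_profile`, and `all_pairs` are hypothetical and not taken from the published method. The actual library, scoring scheme, and classifiers may differ substantially.

```python
from itertools import combinations

# Hypothetical toy library of interface fragment pairs (invented, not from the paper).
FRAGMENT_PAIRS = [
    ("VTLSC", "YRCEA"),
    ("GSYTC", "LTISN"),
    ("KPEDT", "WYQQK"),
]


def best_match(fragment, sequence):
    """Highest fraction of identical residues over all ungapped windows."""
    k = len(fragment)
    if len(sequence) < k:
        return 0.0
    best = 0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        matches = sum(a == b for a, b in zip(fragment, window))
        best = max(best, matches)
    return best / k


def pair_profile(seq_a, seq_b, library=FRAGMENT_PAIRS):
    """One feature per library entry, symmetric in the two query sequences."""
    profile = []
    for frag_1, frag_2 in library:
        # Either query may carry either side of the interface.
        forward = best_match(frag_1, seq_a) * best_match(frag_2, seq_b)
        reverse = best_match(frag_1, seq_b) * best_match(frag_2, seq_a)
        profile.append(max(forward, reverse))
    return profile


def all_pairs(names):
    """Every unordered pair of distinct proteins for an all-against-all screen."""
    return list(combinations(names, 2))
```

In a full pipeline, a profile of this kind would be the input vector for a trained classifier that outputs an interaction probability. For a set of 46 proteins, `all_pairs` yields 1035 pairs of distinct proteins; this sketch leaves out homophilic (self) pairs, which `itertools.combinations_with_replacement` would include.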

Funder

National Institute of General Medical Sciences

National Institutes of Health

Albert Einstein College of Medicine, Yeshiva University

Publisher

Wiley

Subject

Molecular Biology, Biochemistry, Structural Biology
