Detecting Protein Function and Protein-Protein Interactions from Genome Sequences

Authors:

Edward M. Marcotte¹, Matteo Pellegrini¹, Ho-Leung Ng¹, Danny W. Rice¹, Todd O. Yeates¹, David Eisenberg¹

Affiliation:

1. UCLA–Department of Energy Laboratory of Structural Biology and Molecular Medicine, Departments of Chemistry and Biochemistry and Biological Chemistry, Box 951570, University of California at Los Angeles, Los Angeles, CA 90095–1570, USA.

Abstract

A computational method is proposed for inferring protein interactions from genome sequences on the basis of the observation that some pairs of interacting proteins have homologs in another organism fused into a single protein chain. Searching sequences from many genomes revealed 6809 such putative protein-protein interactions in Escherichia coli and 45,502 in yeast. Many members of these pairs were confirmed as functionally related; computational filtering further enriches for interactions. Some proteins have links to several other proteins; these coupled links appear to represent functional interactions such as complexes or pathways. Experimentally confirmed interacting pairs are documented in a Database of Interacting Proteins.

Publisher

American Association for the Advancement of Science (AAAS)

Subject

Multidisciplinary

References (30 articles)

1. B. Alberts et al., Molecular Biology of the Cell (Garland, New York, ed. 3, 1994); H. Lodish et al., Molecular Cell Biology (Scientific American Books, New York, ed. 3, 1995).

2. S. Fields and O. K. Song, Nature 340, 243 (1989).

3. J. M. Berger, S. J. Gamblin, S. C. Harrison, J. C. Wang, 379, 225 (1996).

4. The Complete Genome Sequence of Escherichia coli K-12

5. The triplets of proteins are found with the aid of protein domain databases such as the ProDom or Pfam databases (17). Here, a list of all ProDom domains in every one of the 64,568 SWISS-PROT proteins was prepared, as well as a list of all proteins that contain each of the 53,597 ProDom domains. Then every protein in ProDom was considered for its ability to be a linking (or Rosetta Stone) member in a triplet. All pairs of domains that are both members of a given protein P were defined as being linked by protein P if we could find at least one protein with only one of the two domains. By this method, we found 14,899 links between the 7843 ProDom domains. Then, in a single genome (such as E. coli), we found all nonhomologous pairs of proteins containing linked domains. These pairs are linked by the Rosetta Stone proteins. For E. coli, this method finds 3531 protein pairs. An alternate method for discovering protein triplets uses amino acid sequence alignment techniques to find two proteins that align to a Rosetta Stone protein such that the alignments do not overlap on the Rosetta Stone protein. For E. coli, this method finds 4487 protein pairs, 1209 of which were also found by the ProDom search method (even though different sequence databases were searched for each method). All predictions are available on the World Wide Web at www.doe-mbi.ucla.edu.
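The domain-based search described in note 5 can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation. The function names (`find_domain_links`, `find_linked_pairs`, `alignments_overlap`) and the toy identifiers are hypothetical. Two simplifications are assumed: "a protein with only one of the two domains" is read as a protein that contains exactly one of the pair, and two proteins are treated as nonhomologous when they share no domain.

```python
from itertools import combinations


def find_domain_links(protein_domains):
    """Map each linked domain pair to the proteins that link it.

    protein_domains: dict of protein id -> set of domain ids (whole database).
    A pair (a, b) found together in protein P counts as linked by P only if
    some protein in the database contains exactly one of a and b
    (assumed reading of the criterion in note 5).
    """
    domain_sets = list(protein_domains.values())
    links = {}
    for protein, domains in protein_domains.items():
        for a, b in combinations(sorted(domains), 2):
            if any((a in d) != (b in d) for d in domain_sets):
                links.setdefault(frozenset((a, b)), set()).add(protein)
    return links


def find_linked_pairs(genome_domains, links):
    """Find protein pairs in one genome that carry linked domains.

    genome_domains: dict of protein id -> set of domain ids (one genome).
    Proteins sharing any domain are skipped as a crude stand-in for the
    nonhomology requirement (an assumption, not the published filter).
    Returns dict of (protein, protein) -> set of linking proteins.
    """
    pairs = {}
    for p, q in combinations(sorted(genome_domains), 2):
        dp, dq = genome_domains[p], genome_domains[q]
        if dp & dq:
            continue
        for a in dp:
            for b in dq:
                stones = links.get(frozenset((a, b)))
                if stones:
                    pairs.setdefault((p, q), set()).update(stones)
    return pairs


def alignments_overlap(hit_a, hit_b):
    """Check the alignment-based criterion: hit_a and hit_b are inclusive
    (start, end) coordinates of two alignments on the same candidate
    Rosetta Stone protein; a valid triplet needs them not to overlap."""
    return hit_a[0] <= hit_b[1] and hit_b[0] <= hit_a[1]


# Toy data: "fusion" carries D1 and D2; "x" and "y" carry one each.
database = {"fusion": {"D1", "D2"}, "x": {"D1"}, "y": {"D2"}, "z": {"D3", "D4"}}
genome = {"x": {"D1"}, "y": {"D2"}, "w": {"D3"}}
links = find_domain_links(database)
pairs = find_linked_pairs(genome, links)
```

In the toy data, D3 and D4 never occur apart, so they are not counted as linked, while D1 and D2 are linked by `fusion`. The E. coli counts in note 5 are consistent with the abstract: 3531 + 4487 - 1209 = 6809 distinct pairs across the two methods.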
