Protein prediction models support widespread post-transcriptional regulation of protein abundance by interacting partners

Published:2022-11-10 Issue:11 Volume:18 Page:e1010702
ISSN:1553-7358
Container-title:PLOS Computational Biology
language:en
Short-container-title:PLoS Comput Biol

Author:

Srivastava Himangi, Lippincott Michael J., Currie Jordan, Canfield Robert, Lam Maggie P. Y., Lau Edward

Abstract

Protein and mRNA levels correlate only moderately. The availability of proteogenomics data sets with protein and transcript measurements from matching samples is providing new opportunities to assess the degree to which protein levels in a system can be predicted from mRNA information. Here we examined the contributions of input features in protein abundance prediction models. Using large proteogenomics data from 8 cancer types within the Clinical Proteomic Tumor Analysis Consortium (CPTAC) data set, we trained models to predict the abundance of over 13,000 proteins using matching transcriptome data from up to 958 tumor or normal adjacent tissue samples each, and compared predictive performances across algorithms, data set sizes, and input features. Over one-third of proteins (4,648) showed relatively poor predictability (elastic net r ≤ 0.3) from their cognate transcripts. Moreover, we found widespread occurrences where the abundance of a protein is considerably less well explained by its own cognate transcript level than by the levels of one or more trans-locus transcripts. The incorporation of additional trans-locus transcript abundance data as input features progressively improved the ability to predict sample protein abundance. Transcripts that contribute to non-cognate protein abundance primarily involve those encoding known or predicted interaction partners of the protein of interest, including not only large multi-protein complexes as previously shown, but also small stable complexes in the proteome with only one or a few stable interacting partners. Network analysis further shows a complex proteome-wide interdependency of protein abundance on the transcript levels of multiple interacting partners. The predictive model analysis here therefore supports the view that protein-protein interactions, including those in small protein complexes, exert post-transcriptional influence on proteome composition more broadly than previously recognized. Moreover, the results suggest that mRNA and protein co-expression analysis may have utility for finding gene interactions and predicting expression changes in biological systems.
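
The comparison described in the abstract (predicting a protein from its cognate transcript alone versus from its cognate transcript plus trans-locus transcripts, scored by the correlation r between predicted and observed abundance) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' pipeline: the data are synthetic rather than CPTAC, the effect sizes are invented, the elastic net is a small coordinate-descent implementation using the common `alpha`/`l1_ratio` parameterization, and all function and variable names are hypothetical.

```python
import random


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def soft_threshold(v, t):
    """Shrink v toward zero by t (the L1 part of the elastic net update)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_elastic_net(X, y, alpha=0.01, l1_ratio=0.5, n_iter=200):
    """Coordinate descent for
    (1/2n)||y - Xw||^2 + alpha*l1_ratio*|w|_1 + 0.5*alpha*(1-l1_ratio)*|w|^2.
    Returns (weights, intercept). Illustrative only, not tuned."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    resid = [v - y_mean for v in y]  # residual of centered y with w = 0
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in Xc]
            z = sum(c * c for c in col) / n
            # Correlation of feature j with the partial residual
            rho = sum(c * (r + w[j] * c) for c, r in zip(col, resid)) / n
            new_w = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1 - l1_ratio))
            delta = new_w - w[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, col)]
            w[j] = new_w
    intercept = y_mean - sum(wj * m for wj, m in zip(w, x_mean))
    return w, intercept


def held_out_r(X, y, split):
    """Train on the first `split` samples, report r on the remainder."""
    w, b = fit_elastic_net(X[:split], y[:split])
    pred = [b + sum(wj * xj for wj, xj in zip(w, row)) for row in X[split:]]
    return pearson_r(pred, y[split:])


# Synthetic samples (assumed effect sizes): the protein depends weakly on its
# own transcript and strongly on the transcript of an interaction partner.
rng = random.Random(0)
X_cognate, X_full, y = [], [], []
for _ in range(200):
    own = rng.gauss(0, 1)
    partner = rng.gauss(0, 1)
    unrelated = rng.gauss(0, 1)
    protein = 0.3 * own + 0.9 * partner + rng.gauss(0, 0.3)
    X_cognate.append([own])
    X_full.append([own, partner, unrelated])
    y.append(protein)

r_cognate = held_out_r(X_cognate, y, split=150)
r_full = held_out_r(X_full, y, split=150)
print(f"cognate transcript only: r = {r_cognate:.2f}")
print(f"with trans-locus transcripts: r = {r_full:.2f}")
```

In this toy setting the held-out r rises once the partner transcript is available as a feature, which mirrors the qualitative pattern the abstract reports; the specific numbers carry no biological meaning.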

Funder

NIH Office of the Director

National Heart, Lung, and Blood Institute

Publisher

Public Library of Science (PLoS)

Subject

Computational Theory and Mathematics; Cellular and Molecular Neuroscience; Genetics; Molecular Biology; Ecology; Modeling and Simulation; Ecology, Evolution, Behavior and Systematics

References: 64 articles.

1. Correlation between protein and mRNA abundance in yeast;SP Gygi;Mol Cell Biol,1999

2. On the Dependency of Cellular Protein Levels on mRNA Abundance;Y Liu;Cell,2016

3. Insights into the regulation of protein abundance from proteomic and transcriptomic analyses;C Vogel;Nat Rev Genet,2012

4. Post-transcriptional regulation across human tissues;A Franks;PLoS Comput Biol,2017

5. Experimental reproducibility limits the correlation between mRNA and protein abundances in tumour proteomic profiles;SR Upadhya;Systems Biology,2021

Cited by 9 articles.

1. Multi-dataset Integration and Residual Connections Improve Proteome Prediction from Transcriptomes using Deep Learning;2024-07-11

2. Gene expression response under thermal stress in two Hawaiian corals is dominated by ploidy and genotype;Ecology and Evolution;2024-07

3. Proteome‐wide association study using cis and trans variants and applied to blood cell and lipid‐related traits in the Women's Health Initiative study;Genetic Epidemiology;2024-06-28

4. A Ratiometric Catalog of Protein Isoform Shifts in the Cardiac Fetal Gene Program;2024-04-10

5. Interferons dominate damage and activity in juvenile scleroderma;Modern Rheumatology;2024-04-06