Enhanced protein isoform characterization through long-read proteogenomics

Published:2022-03-03 Issue:1 Volume:23 Page:
ISSN:1474-760X
Container-title:Genome Biology
language:en
Short-container-title:Genome Biol

Author:

Miller Rachel M., Jordan Ben T., Mehlferber Madison M., Jeffery Erin D., Chatzipantsiou Christina, Kaur Simi, Millikin Robert J., Dai Yunxiang, Tiberi Simone, Castaldi Peter J., Shortreed Michael R., Luckey Chance John, Conesa Ana, Smith Lloyd M., Deslattes Mays Anne, Sheynkman Gloria M.

Abstract

Background: The detection of physiologically relevant protein isoforms encoded by the human genome is critical to biomedicine. Mass spectrometry (MS)-based proteomics is the preeminent method for protein detection, but isoform-resolved proteomic analysis relies on accurate reference databases that match the sample; neither a subset nor a superset database is ideal. Long-read RNA sequencing (e.g., PacBio or Oxford Nanopore) provides full-length transcripts, which can be used to predict full-length protein isoforms.

Results: We describe here a long-read proteogenomics approach for integrating sample-matched long-read RNA-seq and MS-based proteomics data to enhance isoform characterization. We introduce a classification scheme for protein isoforms, discover novel protein isoforms, and present the first protein inference algorithm for the direct incorporation of long-read transcriptome data to enable detection of protein isoforms previously intractable to MS-based detection. We have released an open-source Nextflow pipeline that integrates long-read sequencing in a proteomic workflow for isoform-resolved analysis.

Conclusions: Our work suggests that the incorporation of long-read sequencing and proteomic data can facilitate improved characterization of human protein isoform diversity. Our first-generation pipeline provides a strong foundation for future development of long-read proteogenomics and its adoption for both basic and translational research.

Funder

National Institute of General Medical Sciences

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1186/s13059-022-02624-y.pdf

References: 93 articles.

1. Mann M, Kulak NA, Nagaraj N, Cox J. The coming age of complete, accurate, and ubiquitous proteomes. Mol Cell. 2013;49:583–90.

2. Tapial J, Ha KCH, Sterne-Weiler T, Gohr A, Braunschweig U, Hermoso-Pulido A, et al. An atlas of alternative splicing profiles and functional associations reveals new regulatory programs and genes that simultaneously express multiple major isoforms. Genome Res. 2017;27:1759–68.

3. Kelemen O, Convertini P, Zhang Z, Wen Y, Shen M, Falaleeva M, et al. Function of alternative splicing. Gene. 2013;514:1–30.

4. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread expansion of protein interaction capabilities by alternative splicing. Cell. 2016;164:805–17.

5. Cooper TA, Wan L, Dreyfuss G. RNA and disease. Cell. 2009;136:777–93.

Cited by 41 articles.

1. Long-read proteogenomics to connect disease-associated sQTLs to the protein isoform effectors of disease;The American Journal of Human Genetics;2024-09

2. Gene expression and alternative splicing analysis in a large-scale Multiple Sclerosis study;2024-08-16

3. Long-read transcript sequencing identifies differential isoform expression in the entorhinal cortex in a transgenic model of tau pathology;Nature Communications;2024-08-02

4. Multi-omic profiling of pathogen-stimulated primary immune cells;iScience;2024-08

5. IS-PRM-Based Peptide Targeting Informed by Long-Read Sequencing for Alternative Proteome Detection;Journal of the American Society for Mass Spectrometry;2024-07-16