SMAP is a pipeline for sample matching in proteogenomics

Published: 2022-02-08 Issue: 1 Volume: 13 Page:
ISSN: 2041-1723
Container-title: Nature Communications
Language: en
Short-container-title: Nat Commun

Author:

Li Ling, Niu Mingming, Erickson Alyssa, Luo Jie, Rowbotham Kincaid, Guo Kai, Huang He, Li Yuxin, Jiang Yi, Hur Junguk, Liu Chunyu, Peng Junmin, Wang Xusheng

Abstract

The integration of genomics and proteomics data (proteogenomics) holds the promise of furthering the in-depth understanding of human disease. However, sample mix-up is a pervasive problem in proteogenomics because of the complexity of sample processing. Here, we present a pipeline for Sample Matching in Proteogenomics (SMAP) to verify sample identity and ensure data integrity. SMAP infers sample-dependent protein-coding variants from quantitative mass spectrometry (MS), and aligns the MS-based proteomic samples with genomic samples by two discriminant scores. Theoretical analysis with simulated data indicates that SMAP is capable of uniquely matching proteomic and genomic samples when ≥20% genotypes of individual samples are available. When SMAP was applied to a large-scale dataset generated by the PsychENCODE BrainGVEX project, 54 samples (19%) were corrected. The correction was further confirmed by ribosome profiling and chromatin sequencing (ATAC-seq) data from the same set of samples. Our results demonstrate that SMAP is an effective tool for sample verification in a large-scale MS-based proteogenomics study. SMAP is publicly available at https://github.com/UND-Wanglab/SMAP, and a web-based version can be accessed at https://smap.shinyapps.io/smap/.
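The matching idea in the abstract (aligning each proteomic sample to a genomic sample using two discriminant scores) can be illustrated with a minimal sketch. This is not SMAP's code: the function names, the genotype encoding (alternate-allele counts 0/1/2), and the two scores used here (a concordance fraction, and the gap between the best and second-best concordance) are assumptions chosen for illustration only. The actual score definitions are given in the paper and the GitHub repository.

```python
from typing import Dict, Optional, Tuple

# Hypothetical encoding: variant ID -> alternate-allele count (0, 1, or 2).
Genotypes = Dict[str, int]


def concordance(inferred: Genotypes, reference: Genotypes) -> float:
    """Fraction of shared variants whose genotypes agree (illustrative score)."""
    shared = [v for v in inferred if v in reference]
    if not shared:
        return 0.0
    matches = sum(1 for v in shared if inferred[v] == reference[v])
    return matches / len(shared)


def match_sample(
    inferred: Genotypes, genomic_panel: Dict[str, Genotypes]
) -> Tuple[Optional[str], float, float]:
    """Return (best genomic sample ID, best score, gap to the runner-up).

    A high best score together with a large gap suggests an unambiguous match;
    both thresholds would have to be chosen by the user (assumption).
    """
    scores = sorted(
        ((concordance(inferred, g), sid) for sid, g in genomic_panel.items()),
        reverse=True,
    )
    if not scores:
        return None, 0.0, 0.0
    best_score, best_id = scores[0]
    runner_up = scores[1][0] if len(scores) > 1 else 0.0
    return best_id, best_score, best_score - runner_up


# Toy data: genotypes inferred from one MS sample vs. two genomic samples.
panel = {
    "G1": {"v1": 0, "v2": 1, "v3": 2, "v4": 0},
    "G2": {"v1": 1, "v2": 1, "v3": 0, "v4": 2},
}
ms_inferred = {"v1": 0, "v2": 1, "v3": 2}
best_id, best_score, gap = match_sample(ms_inferred, panel)
print(best_id, round(best_score, 3), round(gap, 3))
```

In this toy case the MS-inferred genotypes agree with G1 at all three shared variants and with G2 at only one, so G1 is the best match with a clear margin. A labelled proteomic sample whose best match is a different genomic sample would be a candidate mix-up.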

Publisher

Springer Science and Business Media LLC

Subject

General Physics and Astronomy; General Biochemistry, Genetics and Molecular Biology; General Chemistry; Multidisciplinary

Link

https://www.nature.com/articles/s41467-022-28411-8.pdf

References: 27 articles.

1. Weinstein, J. N. et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat. Genet. 45, 1113–1120 (2013).

2. Zhang, H. et al. Integrated proteogenomic characterization of human high-grade serous ovarian cancer. Cell 166, 755–765 (2016).

3. Zhang, B. et al. Proteogenomic characterization of human colon and rectal cancer. Nature 513, 382–387 (2014).

4. Vasaikar, S. et al. Proteogenomic analysis of human colon cancer reveals new therapeutic opportunities. Cell 177, 1035–1049.e19 (2019).

5. Mertins, P. et al. Proteogenomics connects somatic mutations to signalling in breast cancer. Nature 534, 55–62 (2016).

Cited by 4 articles.

1. Genetic regulation of human brain proteome reveals proteins implicated in psychiatric disorders;Molecular Psychiatry;2024-05-09

2. Human brain aging heterogeneity observed from multi-region omics data reveals a subtype closely related to Alzheimer’s disease;2024-03-05

3. Genetic Modulation of Protein Expression in Rat Brain;2024-02-21

4. Multi-omic atlas of the parahippocampal gyrus in Alzheimer’s disease;Scientific Data;2023-09-08