Shotgun proteomics aids discovery of novel protein-coding genes, alternative splicing, and “resurrected” pseudogenes in the mouse genome

Published:2011-04-01 Issue:5 Volume:21 Page:756-767
ISSN:1088-9051
Container-title:Genome Research
language:en
Short-container-title:Genome Res.

Author:

Brosch Markus,Saunders Gary I.,Frankish Adam,Collins Mark O.,Yu Lu,Wright James,Verstraten Ruth,Adams David J.,Harrow Jennifer,Choudhary Jyoti S.,Hubbard Tim

Abstract

Recent advances in proteomic mass spectrometry (MS) offer the chance to marry high-throughput peptide sequencing to transcript models, allowing the validation, refinement, and identification of new protein-coding loci. We present a novel pipeline that integrates highly sensitive and statistically robust peptide spectrum matching with genome-wide protein-coding predictions to perform large-scale gene validation and discovery in the mouse genome for the first time. In searching in excess of 10 million spectra, we have been able to validate 32%, 17%, and 7% of all protein-coding genes, exons, and splice boundaries, respectively. Moreover, we present strong evidence for the identification of multiple alternatively spliced translations from 53 genes and have uncovered 10 entirely novel protein-coding genes, which are not covered in any mouse annotation data sources. One such novel protein-coding gene is a fusion protein that spans the Ins2 and Igf2 loci to produce a transcript encoding the insulin II and the insulin-like growth factor 2–derived peptides. We also report nine processed pseudogenes that have unique peptide hits, demonstrating, for the first time, that they are not just transcribed but are translated and are therefore resurrected into new coding loci. This work not only highlights an important utility for MS data in genome annotation but also provides unique insights into gene structure and propagation in the mouse genome. All these data have subsequently been used to improve the publicly available mouse annotation in both the Vega and Ensembl genome browsers (http://vega.sanger.ac.uk).

Publisher

Cold Spring Harbor Laboratory

Subject

Genetics (clinical),Genetics

References: 74 articles.

1. Mouse project to find each gene's role

2. Manual annotation and analysis of the defensin gene cluster in the C57BL/6J mouse reference genome

3. The Vertebrate Genome Annotation (Vega) database

4. Current topics in genome evolution: Molecular mechanisms of new gene formation

5. Retrocopy contributions to the evolution of the human genome

Cited by 111 articles.

1. The Pseudogene RPS27AP5 Reveals Novel Ubiquitin and Ribosomal Protein Variants Involved in Specialised Ribosomal Functions;2024-02-09

2. Four classic “de novo” genes all have plausible homologs and likely evolved from retro-duplicated or pseudogenic sequences;Molecular Genetics and Genomics;2024-02-05

3. Identification of potential pseudogenes for predicting the prognosis of hepatocellular carcinoma;Journal of Cancer Research and Clinical Oncology;2023-08-09

4. Identification of potential pseudogenes for predicting the prognosis of hepatocellular carcinoma;2023-07-17

5. Four classic “de novo” genes all have plausible homologs and likely evolved from retro-duplicated or pseudogenic sequences;2023-05-30