Abstract
Arabidopsis thaliana Col-0 has plastid and mitochondrial genomes encoding over one hundred proteins and several ORFs. Public databases (e.g. Araport11) have redundancy and discrepancies in gene identifiers for these organelle-encoded proteins. RNA editing changes specific amino acid residues or creates start and stop codons for many of these proteins, but the impact of such RNA editing at the protein level is largely unexplored due to the complexities of detection. This study first assembled the non-redundant set of identifiers, their correct protein sequences, and 452 predicted non-synonymous editing sites, of which 56 are edited at lower frequency. Accumulation of edited and/or unedited proteoforms was then determined by searching ∼259 million raw MS/MS spectra from ProteomeXchange as part of the Arabidopsis PeptideAtlas (www.peptideatlas.org/builds/arabidopsis/). All mitochondrial proteins and all except three plastid-encoded proteins (NDHG/NDH6, PSBM, RPS16), but none of the ORFs, were identified; we suggest that all ORFs and RPS16 are pseudogenes. Detection frequencies for each edit site and type of edit (e.g. S to L/F) were determined at the protein level, cross-referenced against the metadata (e.g. tissue), and evaluated for technical challenges of detection. In total, 167 predicted edit sites were detected at the proteome level. Minor-frequency sites were indeed also edited at low frequency at the protein level. However, except for sites RPL5-22 and CCB382-124, proteins accumulate only in edited form (>98–100% edited), even if RNA editing levels are well below 100%. This study establishes that RNA editing at major editing sites is required for stable protein accumulation.
Publisher
Cold Spring Harbor Laboratory