Affiliation:
1. Department of Biochemistry, All India Institute of Medical Sciences, New Delhi, India
Abstract
An RNA G-quadruplex in the protein-coding segment of an mRNA is translatable [Formula: see text] and may affect protein translation. This effect may be a consequence of staggered ribosomal synthesis and/or may result in an increased frequency of missense translational events. A mathematical model of the peptides that encompass the substituted amino acids, i.e., the [Formula: see text]-mapped peptidome, has been studied previously. However, the significance of this model and its relevance to disease biology remain to be established. ProTG4 computes a confidence-of-sequence-identity [Formula: see text]-score, which is the average weighted length of every matched [Formula: see text]-mapped peptide in a generic protein sequence. The weighted length is the product of the length of the peptide and the probability of its non-random occurrence in a library of randomly generated sequences of equivalent lengths. This is then averaged over the entire length of the protein sequence. ProTG4 is simple to operate, has clear instructions, and is accompanied by a set of ready-to-use examples. The rationale of the study, the algorithms, and the computational pipeline deployed are also described on the web page. Analyses by ProTG4 of taxonomically diverse protein sequences suggest that there is significant homology to [Formula: see text]-mapped peptides. These findings, especially in potentially infectious and infesting agents, offer plausible explanations for the aetiology and pathogenesis of certain proteopathies. ProTG4 can also provide a quantitative measure to identify and annotate the canonical form of a generic protein sequence from its known isoforms. The article presents several case studies and discusses the relevance of ProTG4-assisted peptide analysis in gaining insights into various mechanisms of disease biology (mistranslation, alternative splicing, amino acid substitutions).
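The score described in the abstract (peptide length multiplied by a probability of non-random occurrence, averaged over the protein length) can be illustrated with a minimal sketch. This is not the ProTG4 implementation. The function names, the 20-letter amino acid alphabet, the Monte Carlo estimate of the probability, the choice to generate random sequences of the same length as the query protein, and the peptide library passed in by the caller are all assumptions made for illustration.

```python
import random

# Assumption: the standard 20-letter amino acid alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def non_random_probability(peptide, seq_length, n_random=200, seed=0):
    """Estimate the probability that a peptide match is NOT a chance event.

    Hypothetical estimator: generate n_random random sequences of
    seq_length residues and return 1 minus the fraction that contain
    the peptide as an exact substring.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_random):
        seq = "".join(rng.choice(AMINO_ACIDS) for _ in range(seq_length))
        if peptide in seq:
            hits += 1
    return 1.0 - hits / n_random


def weighted_length_score(protein, peptide_library, n_random=200, seed=0):
    """Sum of (length x non-random probability) over matched peptides,
    divided by the protein length (one reading of "averaged over the
    entire length of the protein sequence")."""
    if not protein:
        raise ValueError("protein sequence must be non-empty")
    total = 0.0
    for peptide in set(peptide_library):
        if peptide and peptide in protein:
            p = non_random_probability(peptide, len(protein), n_random, seed)
            total += len(peptide) * p
    return total / len(protein)


if __name__ == "__main__":
    # Toy inputs, not real mapped peptides.
    protein = "MM" + AMINO_ACIDS + "KK"
    library = [AMINO_ACIDS, "WWWWWW"]
    print(weighted_length_score(protein, library))
```

In this sketch a short peptide that turns up often in random sequences contributes little to the score, while a long peptide that essentially never occurs by chance contributes close to its full length.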
Funder
All India Institute of Medical Sciences
Subject
Applied Mathematics, Computational Mathematics, Computer Science Applications, Molecular Biology, Biochemistry