Author:
Zhang Huiling, Ju Zhen, Zhang Jingjing, Li Xijian, Xiao Hanyang, Chen Xiaochuan, Li Yuetong, Wang Xinran, Wei Yanjie
Abstract
Allosteric regulation, which triggers the functional activity of a protein through conformational changes, is an inherent property of proteins in numerous physiological and pathological scenarios. In the post-genomic era, a central challenge in disease genomics is identifying the biological effects of specific somatic variants on allosteric proteins and the phenotypes they influence during the initiation and progression of disease. Here, we analyzed more than 38,539 mutations observed in 90 human genes with 740 allosteric protein chains. We found that known allosteric protein mutations are associated with many diseases, but the clinical significance of the majority of mutations in allosteric proteins remains unclear. Next, we developed a machine-learning-based model for predicting pathogenic mutations in allosteric proteins, based on the intrinsic characteristics of the proteins and the prediction results of existing methods. When tested on the benchmark allosteric protein dataset, the proposed method achieves an AUC of 0.868 and an AUPR of 0.894. Furthermore, we explored the performance of existing methods in predicting the pathogenicity of mutations at allosteric sites and used the proposed method to identify potentially significant pathogenic mutations at those sites. In summary, these findings illuminate the significance of allosteric mutations in disease processes and contribute a valuable tool for identifying pathogenic mutations as well as previously unknown disease-causing genes that encode allosteric proteins.
Publisher
Cold Spring Harbor Laboratory