M-Ionic: Prediction of metal ion binding sites from sequence using residue embeddings

Published: 2023-04-06

Author:

Shenoy Aditi, Kalakoti Yogesh, Sundar Durai, Elofsson Arne

Abstract

Motivation: Understanding metal-protein interactions can provide structural and functional insights into cellular processes. As the number of protein sequences increases, developing fast yet precise computational approaches to predict and annotate metal-binding sites becomes imperative. Quick and resource-efficient pre-trained protein language model (PLM) embeddings have successfully predicted binding sites from protein sequences despite not using structural or evolutionary features (multiple sequence alignments). Using residue-level embeddings from the PLMs, we have developed a sequence-based method (M-Ionic) to identify metal-binding proteins and predict residues involved in metal binding.

Results: On independent validation of recent proteins, M-Ionic reports an area under the curve (AUROC) of 0.83 (recall = 84.6%) in distinguishing metal-binding from non-binding proteins, compared to an AUROC of 0.74 (recall = 61.8%) for the next best method. In addition to comparable performance to the state-of-the-art method for identifying metal-binding residues (Ca²⁺, Mg²⁺, Mn²⁺, Zn²⁺), M-Ionic provides binding probabilities for six additional ions (i.e., Cu²⁺, PO₄³⁻, SO₄²⁻, Fe²⁺, Fe³⁺, Co²⁺). We show that the PLM embedding of a single residue contains sufficient information about its neighbours to predict its binding properties.

Availability and Implementation: M-Ionic can be used on your protein of interest using a Google Colab Notebook (https://bit.ly/40FrRbK). The GitHub repository (https://github.com/TeamSundar/m-ionic) contains all code and data.

Contact: arne@bioinfo.se

Supplementary information: Supplementary data are available at Bioinformatics online.
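
The Results above are reported as AUROC and recall for separating metal-binding from non-binding proteins. As a minimal sketch of how these two metrics can be computed from per-protein scores, the Python below uses the pairwise (Mann-Whitney) definition of AUROC. The labels, scores, and the 0.5 decision threshold are made-up illustrations and are not data or settings taken from the paper; the function names are hypothetical and are not part of M-Ionic.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive
    scores higher than a randomly chosen negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def recall(labels, scores, threshold=0.5):
    """Fraction of true positives whose score reaches the threshold."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    if not positives:
        raise ValueError("need at least one positive label")
    return sum(1 for s in positives if s >= threshold) / len(positives)


# Illustrative toy data (assumption): 1 = metal-binding, 0 = non-binding.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]

print(f"AUROC  = {auroc(labels, scores):.3f}")
print(f"recall = {recall(labels, scores):.3f}")
```

AUROC is threshold-free, while recall depends on the chosen cutoff, which is why the two numbers are usually reported together.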

Publisher

Cold Spring Harbor Laboratory
