Mutation Pathogenicity Prediction by a Biology Based Explainable AI Multi-Modal Algorithm

Author:

Kellerman Raizy, Nayshool Omri, Barel Ortal, Paz Sharon, Amariglio Ninette, Klang Eyal, Rechavi Gideon

Abstract

Most known pathogenic mutations occur in protein-coding regions of DNA and change the way proteins are made. Deciphering protein structure therefore provides great insight into the molecular mechanisms underlying biological function in human disease. While there have recently been major advances in artificial intelligence-based prediction of protein structure, determination of the biological and clinical relevance of specific mutations is not yet up to clinical standards. This challenge is of utmost medical importance when decisions as critical as suggesting termination of pregnancy or recommending cancer-directed rational drugs depend on the accuracy with which the effect of a specific mutation is predicted. Currently available tools aim to characterize the effect of a mutation on the functionality of the protein according to biochemical criteria, independent of the biological context. A specific change in protein structure can result in either loss of function (LOF) or gain of function (GOF), and the directionality of the effect needs to be taken into consideration when interpreting the biological outcome of the mutation. Here we describe Triple-modalities Variant Interpretation and Analysis (TriVIAI), a tool incorporating three complementary modalities for improved prediction of missense mutation pathogenicity: a protein language model (pLM), a graph neural network (GNN), and a tabular model incorporating physical properties from the protein structure. The TriVIAI ensemble’s predictions compare favorably with existing tools across various metrics, achieving an AUC-ROC of 0.887, a precision-recall curve (PRC) score of 0.68, and a Brier score of 0.16. The TriVIAI ensemble also has two major advantages over other available tools.
The first is the incorporation of biological insights that make it possible to differentiate between GOF mutations, which tend to cluster in specific hotspots and affect structure in a specific functional way, and LOF mutations, which are usually dispersed and can cripple the protein in a variety of ways. Importantly, the advantage over other available tools is more noticeable with GOF mutations, as their effect on protein structure is less disruptive and can be misinterpreted by current variant prioritization strategies. The second significant advantage of TriVIAI is the explainability of the ensemble, in contrast to other available AI-based pathogenicity prediction algorithms, which until now have been a black box for their users. This explainability is of major importance considering the clinical responsibility of the medical decision-makers who use AI-based pathogenicity predictors.
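
For readers less familiar with the reported metrics, the following is a minimal sketch of how a Brier score and an AUC-ROC can be computed for a three-model ensemble. The abstract does not state how TriVIAI combines its modalities, so the unweighted average used here is purely an assumption for illustration, and the function names and all numbers below are hypothetical toy values, not data or results from the paper.

```python
from statistics import mean


def average_ensemble(*model_probs):
    """Unweighted mean of per-variant probabilities (an assumed combination rule)."""
    return [mean(per_variant) for per_variant in zip(*model_probs)]


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 label."""
    return mean((p - y) ** 2 for y, p in zip(y_true, y_prob))


def auc_roc(y_true, y_prob):
    """Probability that a random pathogenic variant outscores a random benign one.

    Ties count as half a win (the Mann-Whitney formulation of AUC-ROC).
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = pathogenic, 0 = benign.
labels = [1, 0, 1, 0]
plm_probs = [0.9, 0.2, 0.6, 0.4]      # stand-in for a protein language model
gnn_probs = [0.8, 0.3, 0.4, 0.5]      # stand-in for a graph neural network
tabular_probs = [0.7, 0.1, 0.5, 0.3]  # stand-in for a tabular model

ensemble_probs = average_ensemble(plm_probs, gnn_probs, tabular_probs)
print("Brier score:", brier_score(labels, ensemble_probs))
print("AUC-ROC:", auc_roc(labels, ensemble_probs))
```

A lower Brier score indicates better-calibrated probabilities, while a higher AUC-ROC indicates better ranking of pathogenic over benign variants. The two metrics measure different properties, which is why they are reported side by side.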

Publisher

Cold Spring Harbor Laboratory
