Insights into the inner workings of transformer models for protein function prediction

Author:

Wenzel Markus (1), Grüner Erik (1), Strodthoff Nils (2)

Affiliation:

1. Department of Artificial Intelligence, Fraunhofer Institute for Telecommunications, Heinrich-Hertz-Institut, HHI, Einsteinufer 37, 10587 Berlin, Germany

2. School VI - Medicine and Health Services, Carl von Ossietzky University of Oldenburg, Ammerländer Heerstr. 114-118, 26129 Oldenburg, Germany

Abstract

Motivation: We explored how explainable artificial intelligence (XAI) can help to shed light on the inner workings of neural networks for protein function prediction. To this end, we extended the widely used XAI method of integrated gradients so that latent representations inside transformer models, which were finetuned for Gene Ontology term and Enzyme Commission number prediction, can be inspected too.

Results: The approach enabled us to identify amino acids in the sequences that the transformers pay particular attention to, and to show that these relevant sequence parts reflect expectations from biology and chemistry. This holds both in the embedding layer and inside the model, where we identified transformer heads whose attribution maps correspond, with statistical significance, to ground truth sequence annotations (e.g. transmembrane regions, active sites) across many proteins.

Availability and Implementation: Source code can be accessed at https://github.com/markuswenzel/xai-proteins.
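For orientation, the standard integrated gradients method named in the abstract can be sketched on a toy model. This is a minimal illustration and not the authors' implementation (their code is in the repository linked above): the sigmoid model, its weights, the zero baseline, and all function names are assumptions made for this example. Integrated gradients attributes a prediction to each input feature by averaging the gradient along a straight path from a baseline to the input and scaling by the input difference.

```python
import math

def model(x, w):
    """Toy differentiable model (hypothetical stand-in): sigmoid of a weighted sum."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def model_grad(x, w):
    """Analytic gradient of the toy model with respect to the input x."""
    p = model(x, w)
    return [p * (1.0 - p) * wi for wi in w]

def integrated_gradients(x, baseline, w, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum over the path."""
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = model_grad(point, w)
        for i in range(n):
            avg_grad[i] += g[i] / steps
    # Scale the path-averaged gradient by the difference between input and baseline.
    return [(xi - b) * gi for xi, b, gi in zip(x, baseline, avg_grad)]

# Illustrative values only (assumed, not taken from the paper).
w = [0.8, -1.2, 0.3]
x = [1.0, 0.5, 2.0]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, w)

# Completeness check: attributions should sum to model(x) - model(baseline).
completeness_gap = abs(sum(attributions) - (model(x, w) - model(baseline, w)))
print(attributions, completeness_gap)
```

The completeness check at the end is a common sanity test for this method: the attributions should add up to the difference between the model output at the input and at the baseline, up to the error of the numerical integration.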

Funder

Bundesministerium für Bildung und Forschung

BIFOLD—Berlin Institute for the Foundations of Learning and Data

Publisher

Oxford University Press (OUP)
