AttOmics: attention-based architecture for diagnosis and prognosis from omics data

Author:

Beaude Aurélien (1, 2), Rafiee Vahid Milad (3), Augé Franck (2), Zehraoui Farida (1), Hanczar Blaise (1)

Affiliation:

1. IBISC, Université Paris-Saclay, Univ Evry, 23 Boulevard de France, Evry-Courcouronnes 91020, France

2. Artificial Intelligence & Deep Analytics, Omics Data Science, Sanofi R&D Data and Data Science, 1 Av. Pierre Brossolette, Chilly-Mazarin 91385, France

3. Sanofi R&D Data and Data Science, Artificial Intelligence & Deep Analytics, Omics Data Science, 450 Water Street, Cambridge, MA 02142, United States

Abstract

Motivation: The increasing availability of high-throughput omics data makes it possible to consider a new kind of medicine centered on individual patients. Precision medicine relies on exploiting these high-throughput data with machine-learning models, especially those based on deep learning, to improve diagnosis. Because omics data are high-dimensional with small sample sizes, current deep-learning models end up with many parameters and have to be fitted on a limited training set. Furthermore, in these models the interactions between molecular entities inside an omics profile are not patient specific: they are the same for all patients.

Results: In this article, we propose AttOmics, a new deep-learning architecture based on the self-attention mechanism. First, we decompose each omics profile into a set of groups, where each group contains related features. Then, by applying the self-attention mechanism to the set of groups, we can capture the different interactions specific to a patient. The results of the experiments carried out in this article show that our model can accurately predict the phenotype of a patient with fewer parameters than deep neural networks. Visualizing the attention maps can provide new insights into the groups that are essential for a particular phenotype.

Availability and implementation: The code and data are available at https://forge.ibisc.univ-evry.fr/abeaude/AttOmics. TCGA data can be downloaded from the Genomic Data Commons Data Portal.
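The two steps described in the abstract (grouping the features, then applying self-attention across the groups) can be illustrated with a minimal sketch. This is not the authors' implementation: the contiguous grouping rule, the single attention head, the random weights standing in for learned parameters, and all function names (`split_into_groups`, `group_self_attention`, and the helpers) are assumptions made for illustration only.

```python
import math
import random


def matvec(vec, mat):
    """Multiply a row vector of length n by an n x m matrix (list of rows)."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def random_matrix(rows, cols, rng):
    """Random weights standing in for learned parameters (assumption)."""
    scale = 1.0 / math.sqrt(rows)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(cols)] for _ in range(rows)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def split_into_groups(profile, n_groups):
    """Split a profile into contiguous, near-equal groups (hypothetical rule).

    Requires n_groups <= len(profile) so that no group is empty.
    """
    base, extra = divmod(len(profile), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        end = start + base + (1 if g < extra else 0)
        groups.append(profile[start:end])
        start = end
    return groups


def group_self_attention(profile, n_groups, d_model, seed=0):
    """Single-head self-attention over feature groups of one patient profile."""
    rng = random.Random(seed)
    groups = split_into_groups(profile, n_groups)
    # Each group gets its own projection to a d_model-dimensional token.
    tokens = [matvec(g, random_matrix(len(g), d_model, rng)) for g in groups]
    w_q, w_k, w_v = (random_matrix(d_model, d_model, rng) for _ in range(3))
    queries = [matvec(t, w_q) for t in tokens]
    keys = [matvec(t, w_k) for t in tokens]
    values = [matvec(t, w_v) for t in tokens]
    # Row i holds how much group i attends to every group; it depends on the
    # input profile, so the attention map differs from patient to patient.
    attention = [
        softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d_model) for k in keys])
        for q in queries
    ]
    outputs = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(d_model)]
        for row in attention
    ]
    return outputs, attention
```

Because attention is computed between groups rather than between individual features, the attention map has `n_groups × n_groups` entries instead of one entry per feature pair, which is what keeps this kind of model small on high-dimensional profiles. The returned `attention` matrix is the quantity one would visualize as an attention map.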

Funder

public–private partnership

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

