Author:
Graça Miguel, Nobre Ricardo, Sousa Leonel, Ilic Aleksandar
Abstract
Understanding the genetic basis of complex diseases is one of the most important challenges in current precision medicine. To this end, Genome-Wide Association Studies aim to correlate Single Nucleotide Polymorphisms (SNPs) with the presence or absence of certain traits. However, these studies do not consider interactions between several SNPs, known as epistasis, which explain most genetic diseases. Analyzing SNP combinations to detect epistasis is a major computational task because of the enormous search space. A possible solution is to employ deep learning strategies for genomic prediction, but the lack of explainability that stems from the black-box nature of neural networks is a challenge yet to be addressed. Herein, a novel, flexible, portable, and scalable framework for network interpretation based on transformers is proposed to tackle any-order epistasis. The results on various epistasis scenarios show that the proposed framework outperforms state-of-the-art methods for explainability, while being scalable to large datasets and portable to various deep learning accelerators. The proposed framework is validated on three WTCCC datasets, identifying SNPs related to genes known in the literature to have direct relationships with the studied diseases.
Funder
Fundação para a Ciência e a Tecnologia
Publisher
Springer Science and Business Media LLC