Affiliation:
1. Barcelona Supercomputing Center, 08034 Barcelona, Spain
2. Computer Architecture Department, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain
3. Institut Català de Recerca i Estudis Avançats, 08010 Barcelona, Spain
Abstract
One of the main goals of human genetics is to understand the connections between genomic variation and the predisposition to develop a complex disorder. These disease–variant associations are usually studied one variant at a time, disregarding possible effects arising from interactions between genomic variants. In complex diseases in particular, these interactions can be directly linked to the disorder and may play an important role in disease development. Although studying them has been suggested as a way to help complete our understanding of the genetic bases of complex diseases, it remains a major challenge because of its large computing demands. Here, we take advantage of high-performance computing technologies to tackle this problem by combining machine learning methods and statistical approaches. As a result, we created a containerized framework that uses multifactor dimensionality reduction (MDR) to detect pairs of variants associated with type 2 diabetes (T2D). This methodology was tested on the Northwestern University NUgene project cohort using a dataset of 1,883,192 variant pairs with a certain degree of association with T2D. Out of the pairs studied, we identified 104 significant pairs, two of which exhibit a potential functional relationship with T2D. These results place the proposed MDR method as a valid, efficient, and portable solution for studying variant interaction in real reduced genomic datasets.
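As an illustration of the MDR idea named in the abstract, the following is a minimal sketch of the classic MDR pooling step for a single pair of variants. It is not the authors' containerized framework. The function name `mdr_pair`, the 0/1/2 genotype coding, and the use of balanced accuracy as the score are assumptions made for this example. Cross-validation and permutation testing, which MDR analyses normally include, are omitted.

```python
from collections import Counter


def mdr_pair(geno_a, geno_b, status):
    """Pool the genotype cells of one variant pair into high/low risk.

    geno_a, geno_b: genotype codes per individual (assumed 0, 1 or 2).
    status: 1 for case, 0 for control.
    Returns (high_risk_cells, balanced_accuracy).
    Hypothetical helper for illustration only.
    """
    n_cases = sum(status)
    n_controls = len(status) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")

    # Count cases and controls in each two-locus genotype cell.
    cases, controls = Counter(), Counter()
    for a, b, s in zip(geno_a, geno_b, status):
        if s == 1:
            cases[(a, b)] += 1
        else:
            controls[(a, b)] += 1

    # A cell is high risk when its case/control ratio exceeds the
    # overall ratio; cross-multiplying avoids division by zero.
    high_risk = {
        cell
        for cell in set(cases) | set(controls)
        if cases[cell] * n_controls > controls[cell] * n_cases
    }

    # Score the reduced one-dimensional (high/low) attribute.
    true_pos = sum(cases[c] for c in high_risk)
    true_neg = sum(n for c, n in controls.items() if c not in high_risk)
    balanced_accuracy = 0.5 * (true_pos / n_cases + true_neg / n_controls)
    return high_risk, balanced_accuracy


if __name__ == "__main__":
    # Toy data, invented for this example.
    a = [0, 0, 1, 1, 2, 2, 0, 1]
    b = [0, 1, 0, 1, 0, 1, 1, 0]
    y = [0, 1, 1, 0, 0, 1, 1, 1]
    print(mdr_pair(a, b, y))
```

In a genome-wide setting this step would be repeated for every candidate pair, which is where the computing demands mentioned in the abstract come from.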
Funder
European Commission
Universitat Politècnica de Catalunya
Generalitat de Catalunya
Spanish Ministry of Science