Abstract
Transcriptome-wide association study (TWAS) is an emerging approach that leverages gene expression to guide genotype-phenotype association mapping. A key component of TWAS is the prediction of gene expression, and many statistical approaches have been developed for this purpose. However, many genes have low expression heritability, which limits the performance of any predictive model. In this work, hypothesizing that appropriate denoising may improve the quality of expression data (including heritability), we propose AE-TWAS, which adds a transformation step before conducting standard TWAS. The transformation consists of two steps: first splitting the whole transcriptome into co-expression networks (modules), and then using an autoencoder (AE) to reconstruct the transcriptome data within each module. This transformation removes noise (including nonlinear noise) from the transcriptome data, paving the way for downstream TWAS. We applied AE-TWAS to the GTEx whole blood transcriptome data and GWAS data of five human diseases, showing two encouraging properties of AE-TWAS: (1) After transformation, the transcriptome data show higher expression heritability at the low-heritability end of the spectrum and higher connectivity within the modules. (2) The transformed transcriptome indeed enables better TWAS performance; moreover, the newly formed highly connected genes (i.e., hub genes) are more functionally relevant to diseases, as evidenced by their functional annotations and overlap with TWAS hits. Taken together, we show that autoencoder transformation produces a “better” transcriptome, which in turn enables improved expression-assisted genotype-phenotype association mapping. The impact of this work may extend beyond the field of gene mapping: AE can be regarded as a nonlinear extension of principal component analysis (PCA), which is routinely used to remove artifacts from expression data.
As such, this work may inspire more expression-based applications to be carried out after an appropriate AE transformation, unlocking the use of AE-denoised transcriptomes in many fields.
Author Summary
We propose using autoencoder (AE)-transformed expression data in transcriptome-wide association studies (TWAS). Currently, TWAS have restricted power because of the limited performance of their predictive models. This is largely due to the low expression heritability of many genes in the reference expression data set. Because the quality of expression data (including heritability) is essential, we aim to improve it through appropriate denoising. Unlike past studies, which used only linear methods such as principal component analysis (PCA) to remove confounders in expression, we incorporate an autoencoder (AE), a nonlinear extension of PCA, to remove artifacts from expression data. Our new method, AE-TWAS, is a two-step process. First, we group highly correlated genes in the transcriptome into co-expression networks (modules) and then use AE to reconstruct the transcriptome data within each module. Second, the transformed transcriptome is used for downstream TWAS. When we applied AE-TWAS to real diseases, two encouraging findings emerged: (1) AE transformation boosted the heritability of low-heritability genes in the transcriptome data. (2) The transformed transcriptome led to better TWAS performance, and its hub genes were functionally more relevant to diseases. Our work unlocks the use of AE-denoised transcriptomes in more expression-based applications.
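The module-wise reconstruction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `train_autoencoder` and `denoise_by_module`, the single tanh hidden layer, the `hidden_fraction` bottleneck size, the learning rate, and the simulated data are all assumptions made for illustration, and the module labels are taken as given rather than derived from a co-expression analysis.

```python
import numpy as np

def train_autoencoder(X, n_hidden, n_epochs=2000, lr=0.01, seed=0):
    """Fit a one-hidden-layer tanh autoencoder to X (samples x genes) by
    full-batch gradient descent on squared error; return the reconstruction."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    W1 = rng.normal(0.0, 0.1, size=(p, n_hidden))   # encoder weights
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, size=(n_hidden, p))   # decoder weights
    b2 = np.zeros(p)
    for _ in range(n_epochs):
        H = np.tanh(X @ W1 + b1)                    # bottleneck representation
        err = H @ W2 + b2 - X                       # reconstruction error
        dH = (err @ W2.T) * (1.0 - H ** 2)          # backprop through tanh
        W2 -= lr * (H.T @ err) / n
        b2 -= lr * err.mean(axis=0)
        W1 -= lr * (X.T @ dH) / n
        b1 -= lr * dH.mean(axis=0)
    return np.tanh(X @ W1 + b1) @ W2 + b2

def denoise_by_module(expr, module_labels, hidden_fraction=0.5, **kwargs):
    """Standardize each gene, then reconstruct every module separately.
    Assumes no gene has zero variance. hidden_fraction is a hypothetical
    choice for the bottleneck width relative to module size."""
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0)
    out = np.empty_like(z)
    for m in np.unique(module_labels):
        idx = np.where(module_labels == m)[0]
        n_hidden = max(1, int(len(idx) * hidden_fraction))
        out[:, idx] = train_autoencoder(z[:, idx], n_hidden, **kwargs)
    return out

# Simulated example: two modules of five genes, each driven by one latent factor.
rng = np.random.default_rng(1)
n_samples = 200
labels = np.repeat([0, 1], 5)                       # assumed module assignment
latent = rng.normal(size=(n_samples, 2))
expr = latent[:, labels] + 0.5 * rng.normal(size=(n_samples, labels.size))

denoised = denoise_by_module(expr, labels)          # same shape as expr
```

Because the bottleneck is narrower than the module, the reconstruction can keep only the variation shared across genes in that module, which is the intuition behind treating the output as a denoised transcriptome. In a real analysis the reconstructed matrix would replace the original expression data when training the expression prediction models for TWAS, and the learning rate and number of epochs would likely need tuning for larger modules.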
Publisher
Cold Spring Harbor Laboratory