Autoencoder-transformed transcriptome improves genotype-phenotype association studies

Author:

Bian Jiayi, Li Qing, Leung Albert, Yang Guotao, Yan Jun, Wu Jingjing, Long Quan

Abstract

Transcriptome-wide association study (TWAS) is an emerging model that leverages gene expression to direct genotype-phenotype association mapping. A key component of TWAS is the prediction of gene expression, and many statistical approaches have been developed for this purpose. However, many genes have low expression heritability, which limits the performance of any predictive model. In this work, hypothesizing that appropriate denoising may improve the quality of expression data (including heritability), we propose AE-TWAS, which adds a transformation step before conducting standard TWAS. The transformation consists of two steps: first splitting the whole transcriptome into co-expression networks (modules), and then using an autoencoder (AE) to reconstruct the transcriptome data within each module. This transformation removes noise (including nonlinear noise) from the transcriptome data, paving the way for downstream TWAS. We applied AE-TWAS to the GTEx whole blood transcriptome data and GWAS data of five human diseases, showing two encouraging properties of AE-TWAS: (1) After transformation, the transcriptome data have higher expression heritability at the low-heritability end of the spectrum and higher connectivity within the modules. (2) The transformed transcriptome indeed enables better performance of TWAS; moreover, the newly formed highly connected genes (i.e., hub genes) are more functionally relevant to diseases, as evidenced by their functional annotations and overlap with TWAS hits. Taken together, we show that autoencoder transformation produces a “better” transcriptome, which in turn enables improved expression-assisted genotype-phenotype association mapping. The impact of this work may extend beyond the field of gene mapping: AE can be regarded as a nonlinear extension of principal component analysis (PCA), which is routinely used to remove artifacts from expression data. As such, this work may inspire more expression-based applications to be carried out after an appropriate AE transformation, unlocking the use of AE-denoised transcriptome in many fields.

Author Summary

We propose to use autoencoder (AE) transformed expression data in transcriptome-wide association studies (TWAS). Currently, TWAS has restricted power because of the limited performance of predictive models. This is largely due to the low expression heritability of many genes in the expression data of the reference data set. Because the quality of expression data (including heritability) is essential, we aim to develop a method that applies appropriate denoising. Unlike past studies, which only used linear methods such as principal component analysis (PCA) to remove confounders in expression, we incorporate an autoencoder (AE), a nonlinear extension of PCA, to remove artifacts in expression data. Our new method, AE-TWAS, is a two-step process. First, we group highly correlated genes in the transcriptome into co-expression networks (modules) and then use AE to recreate the transcriptome data within each module. Second, the transformed transcriptome is used for downstream TWAS. When we applied AE-TWAS to real diseases, two encouraging discoveries emerged: (1) AE transformation boosted the heritability of low-heritability genes in the transcriptome data. (2) The transformed transcriptome led to better performance of TWAS, and its newly formed hub genes were functionally more relevant to diseases. Our work unlocks the use of AE-denoised transcriptome in more expression-based applications.
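The two-step transformation described above (split genes into co-expression modules, then reconstruct each module with an autoencoder) can be illustrated with a minimal Python sketch on synthetic data. This is not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`correlation_modules`, `autoencoder_reconstruct`, `ae_transform`, `mean_within_corr`), the correlation threshold, the use of connected components as a simple stand-in for a co-expression network method, and the tiny one-hidden-layer tanh autoencoder trained by plain gradient descent.

```python
import numpy as np

def correlation_modules(expr, threshold=0.5):
    """Assumed stand-in for module detection: connected components of the
    graph that links genes (columns) with |Pearson r| >= threshold."""
    adj = np.abs(np.corrcoef(expr, rowvar=False)) >= threshold
    labels = -np.ones(expr.shape[1], dtype=int)
    current = 0
    for start in range(expr.shape[1]):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first search over correlated genes
            g = stack.pop()
            for nb in np.flatnonzero(adj[g]):
                if labels[nb] == -1:
                    labels[nb] = current
                    stack.append(nb)
        current += 1
    return labels

def autoencoder_reconstruct(x, n_hidden=1, epochs=2000, lr=0.05, seed=0):
    """Fit a one-hidden-layer tanh autoencoder by full-batch gradient
    descent on squared error and return the reconstruction of x."""
    rng = np.random.default_rng(seed)
    mu, sd = x.mean(axis=0), x.std(axis=0) + 1e-8
    z = (x - mu) / sd  # standardize each gene
    n, d = z.shape
    w1 = rng.normal(0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    w2 = rng.normal(0, 0.1, (n_hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        h = np.tanh(z @ w1 + b1)           # encoder (nonlinear bottleneck)
        err = (h @ w2 + b2 - z) / n        # gradient of loss w.r.t. output
        dh = (err @ w2.T) * (1.0 - h**2)   # backpropagate through tanh
        w2 -= lr * (h.T @ err); b2 -= lr * err.sum(axis=0)
        w1 -= lr * (z.T @ dh);  b1 -= lr * dh.sum(axis=0)
    h = np.tanh(z @ w1 + b1)
    return (h @ w2 + b2) * sd + mu         # back to the original scale

def ae_transform(expr, threshold=0.5, n_hidden=1):
    """Reconstruct each module separately; modules no larger than the
    bottleneck are left unchanged."""
    labels = correlation_modules(expr, threshold)
    out = expr.astype(float).copy()
    for m in np.unique(labels):
        idx = np.flatnonzero(labels == m)
        if len(idx) > n_hidden:
            out[:, idx] = autoencoder_reconstruct(expr[:, idx], n_hidden)
    return out, labels

def mean_within_corr(x, labels):
    """Average |r| between gene pairs in the same module (a crude
    proxy for within-module connectivity)."""
    vals = []
    for m in np.unique(labels):
        idx = np.flatnonzero(labels == m)
        if len(idx) > 1:
            c = np.abs(np.corrcoef(x[:, idx], rowvar=False))
            vals.append(c[np.triu_indices(len(idx), k=1)].mean())
    return float(np.mean(vals))

# Synthetic data: 200 samples, two latent factors, each driving five genes.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))
clean = np.repeat(latent, 5, axis=1)               # genes 0-4 and 5-9
expr = clean + rng.normal(scale=0.5, size=clean.shape)

transformed, labels = ae_transform(expr)
print("modules:", labels.tolist())
print("within-module |r| before:", round(mean_within_corr(expr, labels), 3))
print("within-module |r| after: ", round(mean_within_corr(transformed, labels), 3))
```

Note that with a single hidden unit every reconstructed gene in a module is a linear function of the same bottleneck value, so within-module correlation rises by construction in this toy setting. A real analysis would use a wider bottleneck, a dedicated co-expression network tool, and a deep learning framework, and would then feed the transformed expression matrix into a standard TWAS pipeline.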

Publisher

Cold Spring Harbor Laboratory

