Using attentive gated neural networks to quantify the impact of non-coding variants on transcription factor binding affinity

Published: 2021-08-02

Author:

Patel Neel, Bai Haimeng, Bush William S.

Abstract

A large proportion of non-coding variants lie within the binding sites of transcription factors (TFs), which play a significant role in gene regulation. Quantifying the impact of non-coding variants on TF binding is therefore the first step towards unravelling their regulatory roles in the disease traits with which they are associated. Most modern algorithms used for this purpose are based on convolutional neural network (CNN) architectures. However, these models cannot capture how the position of different sub-sequences within a TF binding site affects binding affinity. In this paper, we use the attentive gated neural network (AGNet) architecture to build a set of TF-AGNet models that predict in vivo TF binding intensities in GM12878 lymphoblastoid cells. These models have novel layers capable of deriving the impact of the relative positions of different DNA sub-sequences within a binding site on TF binding affinity, and of extracting the most relevant prediction features. We show that the TF-AGNet models outperform conventional CNNs at predicting continuous values of TF binding affinity. We also train additional TF-AGNet models for 20 TFs using data from 4 other cell lines to assess the generalizability of their prediction accuracy. Lastly, we show that TF-AGNet based models classify non-coding variants that significantly affect TF binding more accurately than models based on 7 variant annotation tools. This accuracy can be leveraged to derive the gene regulatory roles of millions of non-coding variants across the genome and to further examine their mechanistic associations with complex disease traits.
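
As a rough illustration of how a sequence-based model can score a variant's effect on TF binding, the sketch below predicts a binding score for the reference sequence and for the sequence carrying the alternate allele, then takes the difference. It is not the TF-AGNet architecture: the motif weights, the softmax pooling step, and all function names (`one_hot`, `scan`, `attention_pool`, `variant_effect`) are hypothetical stand-ins chosen only to keep the example small and self-contained.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 indicator values (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def scan(encoded, motif):
    """Slide a motif weight matrix along the sequence (a 1-D convolution)."""
    width = len(motif)
    return [
        sum(encoded[start + i][j] * motif[i][j]
            for i in range(width) for j in range(4))
        for start in range(len(encoded) - width + 1)
    ]

def attention_pool(scores):
    """Softmax-weighted average, so high-scoring positions dominate."""
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return sum(w / total * s for w, s in zip(weights, scores))

def predict(seq, motif):
    """Toy binding score for one sequence."""
    return attention_pool(scan(one_hot(seq), motif))

def variant_effect(ref_seq, position, alt_base, motif):
    """Predicted score change when ref_seq[position] becomes alt_base."""
    alt_seq = ref_seq[:position] + alt_base + ref_seq[position + 1:]
    return predict(alt_seq, motif) - predict(ref_seq, motif)

# Hypothetical 3-bp motif favouring "TGA"; the weights are made up.
TOY_MOTIF = [
    [-1.0, -1.0, -1.0, 2.0],  # first position prefers T
    [-1.0, -1.0, 2.0, -1.0],  # second position prefers G
    [2.0, -1.0, -1.0, -1.0],  # third position prefers A
]

ref = "CCTGACC"
delta = variant_effect(ref, 3, "C", TOY_MOTIF)  # G>C inside the motif match
print(f"predicted change in binding score: {delta:.3f}")
```

A negative `delta` means the toy model predicts weaker binding for the alternate allele. A trained model would replace `predict` with a learned network, but the reference-versus-alternate comparison would have the same shape.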

Publisher

Cold Spring Harbor Laboratory
