DECODE: a Deep-learning framework for Condensing enhancers and refining boundaries with large-scale functional assays

Published:2021-07-01 Issue:Supplement_1 Volume:37 Page:i280-i288
ISSN:1367-4803
Container-title:Bioinformatics
language:en

Author:

Chen Zhanlin¹, Zhang Jing², Liu Jason³, Dai Yi², Lee Donghoon⁴, Min Martin Renqiang⁵, Xu Min⁶, Gerstein Mark¹,³,⁷

Affiliation:

1. Department of Statistics & Data Science, Yale University, New Haven, CT 06520, USA

2. Department of Computer Science, University of California, Irvine, CA 92617, USA

3. Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA

4. Genetics and Genomic Sciences, The Icahn School of Medicine at Mount Sinai, New York, NY 10029-6574, USA

5. NEC Laboratories America, Princeton, NJ 08540, USA

6. Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA

7. Department of Computer Science, Yale University, New Haven, CT 06520, USA

Abstract

Motivation: Mapping distal regulatory elements, such as enhancers, is a cornerstone for elucidating how genetic variations may influence diseases. Previous enhancer-prediction methods have used either unsupervised approaches or supervised methods with limited training data. Moreover, past approaches have implemented enhancer discovery as a binary classification problem without accurate boundary detection, producing low-resolution annotations with superfluous regions and reducing the statistical power for downstream analyses (e.g. causal variant mapping and functional validations). Here, we addressed these challenges via a two-step model called Deep-learning framework for Condensing enhancers and refining boundaries with large-scale functional assays (DECODE). First, we employed direct enhancer-activity readouts from novel functional characterization assays, such as STARR-seq, to train a deep neural network for accurate cell-type-specific enhancer prediction. Second, to improve the annotation resolution, we implemented a weakly supervised object detection framework for enhancer localization with precise boundary detection (to a 10 bp resolution) using Gradient-weighted Class Activation Mapping.

Results: Our DECODE binary classifier outperformed a state-of-the-art enhancer prediction method by 24% in transgenic mouse validation. Furthermore, the object detection framework can condense enhancer annotations to only 13% of their original size, and these compact annotations have significantly higher conservation scores and genome-wide association study variant enrichments than the original predictions. Overall, DECODE is an effective tool for enhancer classification and precise localization.

Availability and implementation: DECODE source code and pre-processing scripts are available at decode.gersteinlab.org.

Supplementary information: Supplementary data are available at Bioinformatics online.
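The second step described in the abstract, localizing an enhancer inside a predicted region with Gradient-weighted Class Activation Mapping, can be illustrated with a minimal NumPy sketch. This is not the DECODE implementation. The function names `grad_cam_1d` and `refine_boundaries`, the relative threshold of 0.5, and the assumption that each activation-map position corresponds to one 10 bp bin are all hypothetical choices made for illustration. The sketch assumes the feature maps of a 1D convolutional layer and the gradients of the enhancer score with respect to them have already been obtained from a trained classifier.

```python
import numpy as np

def grad_cam_1d(feature_maps, gradients):
    """Grad-CAM for a 1D convolutional layer.

    feature_maps, gradients: arrays of shape (channels, positions).
    gradients holds d(score)/d(feature_maps) for the enhancer class.
    """
    # Channel weights: gradients averaged over all positions.
    weights = gradients.mean(axis=1)
    # Weighted sum of feature maps, followed by ReLU.
    cam = np.maximum((weights[:, None] * feature_maps).sum(axis=0), 0.0)
    return cam

def refine_boundaries(cam, region_start, bin_size=10, rel_threshold=0.5):
    """Return (start, end) of the contiguous high-activation segment
    around the peak, in genomic coordinates (end exclusive).

    bin_size and rel_threshold are illustrative assumptions.
    """
    if cam.max() <= 0:
        return None  # no positive evidence anywhere in the region
    mask = cam >= rel_threshold * cam.max()
    peak = int(np.argmax(cam))
    left = peak
    while left > 0 and mask[left - 1]:
        left -= 1
    right = peak
    while right < len(cam) - 1 and mask[right + 1]:
        right += 1
    return (region_start + left * bin_size,
            region_start + (right + 1) * bin_size)

# Toy example: 2 channels, 6 bins of 10 bp starting at position 1000.
features = np.array([[0, 1, 4, 4, 1, 0],
                     [1, 1, 1, 1, 1, 1]], dtype=float)
grads = np.array([[1, 1, 1, 1, 1, 1],
                  [-1, -1, -1, -1, -1, -1]], dtype=float)
cam = grad_cam_1d(features, grads)
print(refine_boundaries(cam, region_start=1000))  # (1020, 1040)
```

In this toy input the 60 bp region is condensed to the 20 bp segment where the activation map exceeds half of its maximum, which mirrors the idea of trimming superfluous flanks from a coarse prediction.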

Funder

NIMH

National Institutes of Health

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

Link

http://academic.oup.com/bioinformatics/article-pdf/37/Supplement_1/i280/39620314/btab283.pdf

References: 43 articles.

1. Toward a gold standard for promoter prediction evaluation;Abeel;Bioinformatics,2009

2. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning;Alipanahi;Nat. Biotechnol,2015

3. Comparative analysis of regulatory information and circuits across distant species;Boyle;Nature,2014

4. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies;Bulik-Sullivan;Nat. Genet,2015

5. Pan-cancer analysis of whole genomes;Campbell;Nature,2020

Cited by 10 articles.

1. A deep learning model for DNA enhancer prediction based on nucleotide position aware feature encoding;iScience;2024-06

2. Validation of Enhancer Regions in Primary Human Neural Progenitor Cells using Capture STARR-seq;2024-03-18

3. Pig-eRNAdb: a comprehensive enhancer and eRNA dataset of pigs;Scientific Data;2024-02-01

4. Integrative approaches based on genomic techniques in the functional studies on enhancers;Briefings in Bioinformatics;2023-11-22

5. Computational methods for identifying enhancer‐promoter interactions;Quantitative Biology;2023-06