A unified encyclopedia of human functional DNA elements through fully automated annotation of 164 human cell types

Published: 2016-11-07

Author:

Libbrecht Maxwell W., Rodriguez Oscar, Weng Zhiping, Bilmes Jeffrey A., Hoffman Michael M., Noble William S.

Abstract

Semi-automated genome annotation methods such as Segway enable understanding of chromatin activity. Here we present chromatin state annotations of 164 human cell types using 1,615 genomics data sets. To produce these annotations, we developed a fully-automated annotation strategy in which we train separate unsupervised annotation models on each cell type and use a machine learning classifier to automate the state interpretation step. Using these annotations, we developed a measure of the importance of each genomic position called the “conservation-associated activity score,” which we use to aggregate information across cell types into a multi-cell type view. The aggregated conservation-associated activity score provides a measure of importance directly attributable to a specific activity in a specific set of cell types. In contrast to evolutionary conservation, this measure is not biased to detect only elements shared with related species. Using the conservation-associated activity score, we combined all our annotations into a single, cell type-agnostic encyclopedia that catalogs all human transcriptional and regulatory elements, enabling easy and intuitive interpretation of the effect of genome variants on phenotype, such as in disease-associated, evolutionarily conserved or positively selected loci. These resources, including cell type-specific annotations, encyclopedia, and a visualization server, are available at http://noble.gs.washington.edu/proj/encyclopedia.

Author Summary

Genome annotation algorithms are an effective class of tools for understanding the function of the genome. These algorithms take as input a set of genome-wide measurements about the activity at each base pair in a given tissue, such as where a given protein is binding or how accessible the DNA is to being read by a protein. The genome is then partitioned and each segment is assigned a label such that positions with the same label exhibit similar patterns in the input data. Such annotations are widely used for many applications, such as to understand the mechanism of impact of a given genetic variant. Here we present, to our knowledge, the most comprehensive set of genome annotations created so far, encompassing 164 human cell types and including 1,615 genomics data sets. These comprehensive annotations are made possible by a strategy that automates the previous interpretation step. Furthermore, we present several methodological innovations that make these genome annotations more useful.
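
The partition-and-label idea described in the Author Summary can be illustrated with a toy sketch. The signal values, the two track names in the comment, and the helper functions `nearest`, `kmeans_labels`, and `to_segments` are all hypothetical. Plain k-means is swapped in purely for brevity; it is not the model used by Segway or in this paper. The sketch only shows how positions with similar input patterns receive the same label and how runs of identical labels become segments.

```python
from itertools import groupby


def nearest(vec, centers):
    """Index of the center closest to vec (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda k: sum((v - c) ** 2 for v, c in zip(vec, centers[k])),
    )


def kmeans_labels(signal, centers, n_iter=10):
    """Plain k-means: label every position, then move each center to its mean."""
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(n_iter):
        labels = [nearest(vec, centers) for vec in signal]
        for k in range(len(centers)):
            members = [vec for vec, lab in zip(signal, labels) if lab == k]
            if members:  # leave a center unchanged if no position uses it
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def to_segments(labels):
    """Merge runs of identical labels into (start, end, label); end is exclusive."""
    segments, start = [], 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((start, start + length, label))
        start += length
    return segments


# Toy input: 10 genomic positions x 2 made-up tracks
# (for example, an accessibility signal and a histone-mark signal).
signal = [
    [0.1, 0.0], [0.2, 0.1], [0.0, 0.1],
    [5.0, 4.8], [5.2, 5.1], [4.9, 5.0], [5.1, 4.9],
    [0.1, 0.2], [0.0, 0.0], [0.2, 0.1],
]

# Seed the two clusters with one low-signal and one high-signal position.
labels = kmeans_labels(signal, centers=[signal[0], signal[3]])
segments = to_segments(labels)
print(segments)
```

The integer labels produced here carry no biological meaning on their own. Mapping them to names such as "promoter" or "quiescent" is the interpretation step that the abstract describes automating with a classifier.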

Publisher

Cold Spring Harbor Laboratory

References: 41 articles.

1. Unsupervised segmentation of continuous genomic data

2. Discovery and characterization of chromatin states for systematic annotation of the human genome

3. Unsupervised pattern discovery in human chromatin structure through genomic segmentation;Nature Methods,2012

4. Identification of higher-order functional domains in the human ENCODE regions

5. Automated mapping of large-scale chromatin structure in ENCODE

Cited by 7 articles.

1. Avocado: a multi-scale deep tensor factorization method learns a latent representation of the human epigenome;Genome Biology;2020-03-30

2. Continuous chromatin state feature annotation of the human epigenome;2018-11-18

3. Cicero Predicts cis-Regulatory DNA Interactions from Single-Cell Chromatin Accessibility Data;Molecular Cell;2018-09

4. Multi-scale deep tensor factorization learns a latent representation of the human epigenome;2018-07-08

5. FUN-LDA: A Latent Dirichlet Allocation Model for Predicting Tissue-Specific Functional Effects of Noncoding Variation: Methods and Applications;The American Journal of Human Genetics;2018-05