Abstract
Structural changes in chromatin modulate access to DNA for all proteins involved in transcription. These changes are linked to variations in epigenetic marks, and the pattern of these marks allows chromatin to be classified into distinct functional states. Importantly, alterations in chromatin states are known to be associated with various diseases; for example, abnormal epigenetic patterns are found in different types of cancer. For most of these diseases, there is not enough epigenomic data available to accurately determine the chromatin states of the affected cells, mainly because of the high cost of these experiments, but also because sample material is insufficient or degraded.

In this work we describe a cascade method based on a random forest algorithm that infers epigenetic marks and thereby reduces the number of experimentally determined marks required to assign chromatin states. Our approach identified several relationships between the patterns of different marks, which strengthens the evidence in favor of a redundant epigenetic code.
Publisher
Cold Spring Harbor Laboratory