Abstract
DNA methylation is an epigenetic event that plays an important role in regulating gene expression. It is important to study DNA methylation, especially differential methylation patterns between two groups of samples (e.g. patients vs. normal individuals). With next-generation sequencing (NGS) technologies, it is now possible to identify differential methylation patterns by considering methylation at the single CG site level across an entire genome. However, large and complex NGS data are challenging to analyze. To address this challenge, we have developed a new statistical method that uses a hidden Markov model and Fisher's exact test (HMM-Fisher) to identify differentially methylated cytosines and regions. We first use a hidden Markov chain to model the methylation signals and infer the methylation state of each individual sample as Not methylated (N), Partly methylated (P), or Fully methylated (F). We then use Fisher's exact test to identify differentially methylated CG sites. We describe the HMM-Fisher method and compare it with commonly cited methods using both simulated data and real sequencing data. The results show that HMM-Fisher outperforms the currently available methods with which we compared it. HMM-Fisher is efficient and robust in identifying heterogeneous differentially methylated (DM) regions.
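The second step described above can be illustrated with a minimal sketch. It assumes the HMM has already assigned one state (N, P, or F) to each sample at a given CG site, and that the two groups are compared with an exact test on a 2×3 table of state counts (the Freeman-Halton extension of Fisher's exact test). The function names, the table layout, and the example data are hypothetical and are not taken from the authors' implementation.

```python
from itertools import product
from math import comb

STATES = ("N", "P", "F")  # Not, Partly, Fully methylated


def state_counts(states):
    """Count how many samples fall into each of the N, P, F states."""
    return [sum(1 for s in states if s == k) for k in STATES]


def fisher_exact_2xk(row1, row2):
    """Two-sided exact p-value for a 2 x k table with fixed margins.

    Sums the probabilities of every table that shares the observed
    margins and is no more probable than the observed table.
    """
    cols = [a + b for a, b in zip(row1, row2)]
    r1 = sum(row1)
    n = r1 + sum(row2)
    denom = comb(n, r1)

    def prob(first_row):
        # Multivariate hypergeometric probability of this first row.
        p = 1
        for c, a in zip(cols, first_row):
            p *= comb(c, a)
        return p / denom

    p_obs = prob(row1)
    p_value = 0.0
    for cand in product(*(range(c + 1) for c in cols)):
        if sum(cand) != r1:
            continue  # row margin not preserved
        p = prob(cand)
        if p <= p_obs * (1 + 1e-9):  # tolerance for float ties
            p_value += p
    return min(p_value, 1.0)


# Hypothetical inferred states at one CG site (illustrative data only).
patients = ["F", "F", "F", "F", "P"]
controls = ["N", "N", "N", "N", "P"]
p = fisher_exact_2xk(state_counts(patients), state_counts(controls))
print(f"p-value at this CG site: {p:.4f}")
```

In a genome-wide analysis this test would be repeated at every CG site, so some form of multiple-testing control would also be needed; that part is not shown here. For a plain 2×2 table, `scipy.stats.fisher_exact` provides the same kind of test.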
Subject
Computational Mathematics, Genetics, Molecular Biology, Statistics and Probability
Cited by
25 articles.