Abstract
Motif enrichment algorithms can identify known sequence motifs that are present to a statistically significant degree in DNA, RNA and protein sequences. Databases of such known motifs exist for DNA- and RNA-binding proteins, as well as for many functional protein motifs. The SEA (“Simple Enrichment Analysis”) algorithm presented here uses a simple, consistent approach for detecting the enrichment of motifs in DNA, RNA or protein sequences, as well as in sequences using user-defined alphabets. SEA can identify known motifs that are enriched in a single set of input sequences, and can also perform differential motif enrichment analysis when presented with an additional set of control sequences. Using in vivo DNA (ChIP-seq) data as input to SEA, and validating motifs with reference motifs derived from in vitro data, we show that SEA is faster than three widely used motif enrichment algorithms (AME, CentriMo and Pscan), while delivering comparable accuracy. We also show that, in contrast to other motif enrichment algorithms, SEA reports accurate estimates of statistical significance. SEA is easy to use via its web server at https://meme-suite.org, and is fully integrated with the widely used MEME Suite of sequence analysis tools, which can be freely downloaded at the same web site for non-commercial use.
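To make the idea of differential motif enrichment concrete, the following is a minimal, generic sketch and not SEA's actual implementation. It assumes a motif can be represented as an exact string (real tools typically score position weight matrices against a threshold), counts how many primary and control sequences contain it, and applies a one-sided Fisher exact test. The function names, the toy sequences, and the choice of test are all illustrative assumptions.

```python
from math import comb


def count_sequences_with_motif(sequences, motif):
    """Count sequences containing at least one exact match to `motif`.

    Assumption: exact string matching stands in for real motif scanning.
    """
    return sum(1 for seq in sequences if motif in seq)


def fisher_enrichment_pvalue(hits_primary, n_primary, hits_control, n_control):
    """One-sided Fisher exact test for enrichment in the primary set.

    Returns P(X >= hits_primary), where X is hypergeometric: the number of
    motif-containing sequences that land in the primary set if the
    motif-containing sequences were spread at random over both sets.
    """
    total = n_primary + n_control
    total_hits = hits_primary + hits_control
    max_hits = min(total_hits, n_primary)
    tail = sum(
        comb(total_hits, k) * comb(total - total_hits, n_primary - k)
        for k in range(hits_primary, max_hits + 1)
    )
    return tail / comb(total, n_primary)


# Toy data (hypothetical): an E-box-like string as the "motif".
primary = ["ACGTGACC", "TTCACGTG", "CACGTGAA", "GGGTTTAA"]
control = ["AAAATTTT", "GGGCCCAA", "CACGTGTT", "TTTTGGGG"]
motif = "CACGTG"

hits_p = count_sequences_with_motif(primary, motif)
hits_c = count_sequences_with_motif(control, motif)
p_value = fisher_enrichment_pvalue(hits_p, len(primary), hits_c, len(control))
print(f"primary hits={hits_p}, control hits={hits_c}, p={p_value:.3f}")
```

With only four sequences per set, the test has almost no power, which is why the toy p-value is large. On real ChIP-seq data, thousands of sequences would be compared, and a correction for testing many motifs would also be needed.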
Publisher
Cold Spring Harbor Laboratory