IUPACpal: efficient identification of inverted repeats in IUPAC-encoded DNA sequences-Reference-Cited by-同舟云学术

IUPACpal: efficient identification of inverted repeats in IUPAC-encoded DNA sequences

Published:2021-02-06 Issue:1 Volume:22 Page:
ISSN:1471-2105
Container-title:BMC Bioinformatics
language:en
Short-container-title:BMC Bioinformatics

Author:

Alamro Hayam,Alzamel Mai,Iliopoulos Costas S.,Pissis Solon P.^ORCID,Watts Steven

Abstract

Abstract Background An inverted repeat is a DNA sequence followed downstream by its reverse complement, potentially with a gap in the centre. Inverted repeats are found in both prokaryotic and eukaryotic genomes and they have been linked with countless possible functions. Many international consortia provide a comprehensive description of common genetic variation making alternative sequence representations, such as IUPAC encoding, necessary for leveraging the full potential of such broad variation datasets. Results We present IUPACpal, an exact tool for efficient identification of inverted repeats in IUPAC-encoded DNA sequences allowing also for potential mismatches and gaps in the inverted repeats. Conclusion Within the parameters that were tested, our experimental results show that IUPACpal compares favourably to a similar application packaged with EMBOSS. We show that IUPACpal identifies many previously unidentified inverted repeats when compared with EMBOSS, and that this is also performed with orders of magnitude improved speed.

Funder

Engineering and Physical Sciences Research Council

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Structural Biology

Link

http://link.springer.com/content/pdf/10.1186/s12859-021-03983-2.pdf

Reference28 articles.

1. Ussery DW, Wassenaar TM, Borini S. Computing for comparative microbial genomics: bioinformatics for microbiologists, vol. 8. Berlin: Springer; 2009.

2. Pearson CE, Zorbas H, Price GB, Zannis-Hadjopoulos M. Inverted repeats, stem-loops, and cruciforms: significance for initiation of DNA replication. J Cell Biochem. 1996;63(1):1–22.

3. Brázda V, Bartas M, Lỳsek J, Coufal J, Fojta M. Global analysis of inverted repeat sequences in human gene promoters reveals their non-random distribution and association with specific biological pathways. Genomics. 2020.