Fast comparison of DNA sequences by oligonucleotide profiling

Published:2008-02-28 Issue:1 Volume:1 Page:
ISSN:1756-0500
Container-title:BMC Research Notes
language:en
Short-container-title:BMC Res Notes

Author:

Arnau Vicente, Gallach Miguel, Marín Ignacio

Abstract

Background: The comparison of DNA sequences is a traditional problem in genomics and bioinformatics. Improvements in personal computers create new opportunities, allowing novel analysis strategies to be implemented.

Findings: We describe a new program, called UVWORD, which determines the number of times that each DNA word present in one sequence (target) is found in a second sequence (source), a procedure that we have called oligonucleotide profiling. On a standard computer, the user may search for words ranging in size from k = 1 to k = 14 nucleotides. Average counts for groups of contiguous words may also be established. The rate of analysis on standard computers ranges from 3.4 million words per second (k = 14) to 16 million words per second (1 ≤ k ≤ 8). This makes fast screening feasible even for the longest known DNA molecules.

Discussion: We show that combining the ability to analyze relatively long words, which occur very rarely by chance, with the high speed of the program makes it possible to perform novel types of screening, complementary to those provided by standard programs such as BLAST. This method can be used to determine oligonucleotide content, to characterize the distribution of repetitive sequences in chromosomes, to determine the evolutionary conservation of sequences in different species, and to establish regions of similar DNA among chromosomes or genomes, among other applications.
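The profiling procedure described in the abstract can be illustrated with a minimal Python sketch. This is not UVWORD's implementation, which the abstract does not describe. The function names (`count_words`, `oligo_profile`, `window_averages`, `expected_chance_count`) are hypothetical. The sketch assumes overlapping words on a single strand, non-overlapping groups for the averages, and a uniform random base composition for the chance estimate. None of these choices is stated in the source.

```python
from collections import Counter


def count_words(source: str, k: int) -> Counter:
    """Count every overlapping k-letter word in the source sequence."""
    source = source.upper()
    return Counter(source[i:i + k] for i in range(len(source) - k + 1))


def oligo_profile(target: str, source: str, k: int) -> list[int]:
    """For each word position in target, return that word's count in source."""
    counts = count_words(source, k)
    target = target.upper()
    # A Counter returns 0 for words that never occur in the source.
    return [counts[target[i:i + k]] for i in range(len(target) - k + 1)]


def window_averages(profile: list[int], window: int) -> list[float]:
    """Average counts over consecutive, non-overlapping groups of words.

    Assumption: the grouping scheme is illustrative; the last group may
    be shorter than `window`.
    """
    groups = (profile[i:i + window] for i in range(0, len(profile), window))
    return [sum(group) / len(group) for group in groups]


def expected_chance_count(source_length: int, k: int) -> float:
    """Expected count of one fixed k-mer in a uniform random sequence.

    Assumption: all four bases are equally likely and independent.
    """
    return max(source_length - k + 1, 0) / 4 ** k


profile = oligo_profile("TACGA", "ACGTACGTAC", k=3)
print(profile)                       # [2, 2, 0]
print(window_averages(profile, 2))   # [2.0, 0.0]
```

Under the uniform-composition assumption there are 4^14 = 268,435,456 possible words of length 14, so a specific 14-mer is expected about once by chance in a random sequence of roughly 268 million bases. This is consistent with the abstract's remark that long words occur very rarely by chance.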

Publisher

Springer Science and Business Media LLC

Subject

General Biochemistry, Genetics and Molecular Biology, General Medicine

Link

https://link.springer.com/content/pdf/10.1186/1756-0500-1-5.pdf

References: 21 articles.

1. Vinga S, Almeida J: Alignment-free sequence comparison – a review. Bioinformatics. 2003, 19: 513-523. 10.1093/bioinformatics/btg005.

2. Karlin S, Campbell AM, Mrázek J: Comparative DNA analysis across diverse genomes. Annu Rev Genet. 1998, 32: 185-225. 10.1146/annurev.genet.32.1.185.

3. Levy S, Compagnoni L, Myers EW, Stormo GD: Xlandscape: the graphical display of word frequencies in sequences. Bioinformatics. 1998, 14: 74-80. 10.1093/bioinformatics/14.1.74.

4. Kent WJ: BLAT – The BLAST-like alignment tool. Genome Res. 2002, 12: 656-664. 10.1101/gr.229202. Article published online before March 2002.

5. Healy J, Thomas EE, Schwartz JT, Wigler M: Annotating large genomes with exact word matches. Genome Res. 2003, 13: 2306-2315. 10.1101/gr.1350803.

Cited by 6 articles.

1. Spectrum structures and biological functions of 8-mers in the human genome;Genomics;2019-05

2. Recurrent Turnover of Chromosome-Specific Satellites in Drosophila;Genome Biology and Evolution;2014-05-19

3. Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis;Briefings in Bioinformatics;2013-07-31

4. Clustering of DNA words and biological function: A proof of principle;Journal of Theoretical Biology;2012-03

5. Further Improvement in Quantifying Male Fetal DNA in Maternal Plasma;Clinical Chemistry;2012-02-01