Multiple seeds sensitivity using a single seed with threshold

Published: 2015-08 Issue: 04 Volume: 13 Page: 1550011
ISSN: 0219-7200
Container-title: Journal of Bioinformatics and Computational Biology
Language: en
Short-container-title: J. Bioinform. Comput. Biol.

Author:

Egidi Lavinia¹, Manzini Giovanni¹

Affiliation:

1. DiSIT, Computer Science Institute, Università del Piemonte Orientale, Alessandria, I-15100, Italy

Abstract

Spaced seeds are a fundamental tool for similarity search in biosequences. The best sensitivity/selectivity trade-offs are obtained using many seeds simultaneously: This is known as the multiple seed approach. Unfortunately, spaced seeds use a large amount of memory and the available RAM is a practical limit to the number of seeds one can use simultaneously. Inspired by some recent results on lossless seeds, we revisit the approach of using a single spaced seed and considering two regions homologous if the seed hits in at least t sufficiently close positions. We show that by choosing the locations of the don't care symbols in the seed using quadratic residues modulo a prime number, we derive single seeds that when used with a threshold t > 1 have competitive sensitivity/selectivity trade-offs, indeed close to the best multiple seeds known in the literature. In addition, the choice of the threshold t can be adjusted to modify sensitivity and selectivity a posteriori, thus enabling a more accurate search in the specific instance at issue. The seeds we propose also exhibit robustness and allow flexibility in usage.
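The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the exact construction, so several details here are assumptions: care positions ('1') are placed at the quadratic residues modulo a prime p and don't care positions ('0') elsewhere, trailing don't cares are trimmed, an alignment is modelled as a string of '1' (match) and '0' (mismatch), and "sufficiently close" is modelled by a hypothetical `window` parameter. The function names are illustrative only.

```python
def quadratic_residues(p):
    """Return the set of nonzero quadratic residues modulo the prime p."""
    return {(i * i) % p for i in range(1, p)}


def qr_seed(p):
    """Build a seed over positions 1..p-1 (assumed layout, not from the paper):
    '1' (care) at quadratic residues, '0' (don't care) elsewhere.
    Trailing don't cares are trimmed so the seed ends with a care symbol."""
    qr = quadratic_residues(p)
    return "".join("1" if i in qr else "0" for i in range(1, p)).rstrip("0")


def seed_hits(seed, matches):
    """Return every start offset where all care positions of the seed
    fall on a match ('1') in the alignment string `matches`."""
    care = [i for i, c in enumerate(seed) if c == "1"]
    span = len(seed)
    return [
        s
        for s in range(len(matches) - span + 1)
        if all(matches[s + i] == "1" for i in care)
    ]


def is_homologous(seed, matches, t, window):
    """Report the region as homologous if at least t hits start within
    `window` positions of each other (hits are already in sorted order)."""
    hits = seed_hits(seed, matches)
    return any(
        hits[i + t - 1] - hits[i] <= window
        for i in range(len(hits) - t + 1)
    )


if __name__ == "__main__":
    seed = qr_seed(7)  # residues mod 7 are {1, 2, 4}, giving "1101"
    print(seed)
    print(is_homologous(seed, "1111111", t=3, window=2))
```

Under these assumptions, raising `t` makes the test stricter (fewer regions reported) and lowering it makes the test more permissive. This mirrors the a posteriori adjustment of sensitivity and selectivity described in the abstract, since the hit list can be computed once and re-thresholded.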

Publisher

World Scientific Pub Co Pte Lt

Subject

Computer Science Applications, Molecular Biology, Biochemistry

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0219720015500110

Cited by 4 articles.

1. ALeS: adaptive-length spaced-seed design;Bioinformatics;2020-12-07

2. Best hits of 11110110111: model-free selection and parameter-free sensitivity calculation of spaced seeds;Algorithms for Molecular Biology;2017-02-14

3. Optimal seed solver: optimizing seed selection in read mapping;Bioinformatics;2015-11-14

4. Spaced seeds improve k-mer-based metagenomic classification;Bioinformatics;2015-07-25