Author:
Park Yonil, Sheetlin Sergey, Ma Ning, Madden Thomas L, Spouge John L
Abstract
Background
Local alignment programs often calculate the probability that a match occurred by chance. The calculation of this probability may require a “finite-size” correction to the lengths of the sequences, as an alignment that starts near the end of either sequence may run out of sequence before achieving a significant score.
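As a rough illustration of why a length correction matters, the sketch below applies one common textbook form of the older, mean-based edge-effect correction: subtract an expected alignment length from each sequence before computing the E-value E = K m n exp(-λS). This is not the improved correction the paper presents. The function names, the floor of 1.0 on the effective length, and the parameter values (λ = 0.267, K = 0.041, H = 0.14, chosen to resemble typical gapped protein scoring) are assumptions made for this example.

```python
import math


def effective_lengths(m, n, K, H, floor=1.0):
    """Mean-based edge-effect correction (illustrative sketch only).

    Subtracts an expected alignment length, ln(K*m*n)/H, from each
    sequence length. The `floor` is a hypothetical stand-in for the
    kind of ad hoc minimum length that short sequences can trigger.
    """
    expected_len = math.log(K * m * n) / H
    m_eff = max(m - expected_len, floor)
    n_eff = max(n - expected_len, floor)
    return m_eff, n_eff


def e_value(score, m, n, lam, K):
    """Expected number of chance alignments scoring at least `score`."""
    return K * m * n * math.exp(-lam * score)


# Assumed, illustrative statistical parameters (not taken from the paper).
LAM, K, H = 0.267, 0.041, 0.14

m, n, score = 1000, 1000, 60
m_eff, n_eff = effective_lengths(m, n, K, H)
raw = e_value(score, m, n, LAM, K)
corrected = e_value(score, m_eff, n_eff, LAM, K)
print(f"effective lengths: {m_eff:.1f} x {n_eff:.1f}")
print(f"E-value without correction: {raw:.3g}")
print(f"E-value with correction:    {corrected:.3g}")

# For short sequences the subtraction can consume nearly the whole
# sequence, so the floor takes over.
print(effective_lengths(20, 20, K, H))
```

With these assumed parameters, two 20-residue sequences fall back to the floor, which is the kind of ad hoc substitution for short sequences that a distribution-based correction is meant to avoid.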
Findings
We present an improved finite-size correction that considers the distribution of sequence lengths rather than simply the corresponding means. This approach improves sensitivity and avoids substituting an ad hoc length for short sequences, a substitution that can underestimate the significance of a match. We use a test set derived from ASTRAL to show improved ROC scores, especially for shorter sequences.
Conclusions
The new finite-size correction improves the calculation of probabilities for a local alignment. It is now used in the BLAST+ package and at the NCBI BLAST web site (http://blast.ncbi.nlm.nih.gov).
Publisher
Springer Science and Business Media LLC
Subject
General Biochemistry, Genetics and Molecular Biology; General Medicine
Cited by
15 articles.