Author:
Nguyen Ken D, Pan Yi, Nong Ge
Abstract
Background
One of the most fundamental and challenging tasks in bioinformatics is to identify related sequences and their hidden biological significance. The most popular and best-established method for this task is to align multiple sequences together. However, multiple sequence alignment is computationally intensive. In addition, advances in DNA/RNA and protein sequencing techniques have produced a volume of sequences to analyze that exceeds the capability of traditional computing models. An effective parallel multiple sequence alignment model capable of resolving these issues is therefore in great demand.
Results
We design O(1) run-time solutions for both local and global dynamic programming pair-wise alignment algorithms on the reconfigurable mesh computing model. To align m sequences with maximum length n, we combine the parallel pair-wise dynamic programming solutions with newly designed parallel components. We reduce the progressive multiple sequence alignment algorithm's run-time complexity from O(m × n^4) to O(m) using O(m × n^3) processing units for scoring schemes that use three distinct values for match/mismatch/gap-extension. The general solution to the multiple sequence alignment algorithm takes O(m × n^4) processing units and completes in O(m) time.
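For orientation, the pair-wise dynamic programming step that the parallel design targets can be sketched in its classic sequential form. This is the textbook Needleman-Wunsch global alignment score, not the reconfigurable mesh algorithm described here. The function name, the default scores, and the use of a single linear gap penalty are assumptions chosen for illustration of a three-value scoring scheme.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Sequential global alignment score (textbook Needleman-Wunsch).

    Assumption: one linear gap penalty stands in for the gap-extension
    value; the default scores are illustrative, not taken from the paper.
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of `b`.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in `b`
            left = curr[j - 1] + gap  # gap inserted in `a`
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))  # 7
```

The sequential version fills an n × n table one cell at a time, so a single pair-wise alignment costs O(n^2) time; the constant-time result stated above concerns evaluating such a table in parallel on a reconfigurable mesh.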
Conclusions
To our knowledge, this is the first time the progressive multiple sequence alignment algorithm has been completely parallelized with O(m) run-time. We also provide a new parallel algorithm for the Longest Common Subsequence (LCS) problem with O(1) run-time using O(n^3) processing units. This is a substantial improvement over the current best constant-time algorithm, which uses O(n^4) processing units.
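As a point of reference for the LCS result, the standard sequential dynamic program is sketched below. It is the textbook O(n^2)-time recurrence, not the constant-time parallel algorithm summarized here, and the function name is hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence (textbook sequential DP)."""
    # prev[j] holds the LCS length of the processed prefix of `a` and b[:j].
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)   # extend a common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(lcs_length("AGGTAB", "GXTXAYB"))  # 4
```

LCS corresponds to an alignment scoring scheme in which a match scores 1 and mismatches and gaps score 0, which is why it fits the same dynamic programming framework as pair-wise alignment.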
Publisher
Springer Science and Business Media LLC
Cited by
7 articles.