The divisible load balance problem with shared cost and its application to phylogenetic inference

Published: 2016-01-02

Author:

Constantin Scholl, Kassian Kobert, Tomáš Flouri, Alexandros Stamatakis

Abstract

Motivated by load balance issues in parallel calculations of the phylogenetic likelihood function, we recently introduced an approximation algorithm for efficiently distributing partitioned alignment data to a given number of CPUs. The goal is to balance the accumulated number of sites per CPU and, at the same time, to minimize the maximum number of unique partitions per CPU. The approximation algorithm assumes that likelihood calculations on individual alignment sites have identical runtimes and that likelihood calculation times on distinct sites are entirely independent of each other. However, a recently introduced optimization of the phylogenetic likelihood function, the so-called site repeats technique, violates both of these assumptions. We therefore modify our data distribution algorithm and explore 72 distinct heuristic strategies that take into account the additional restrictions induced by site repeats, to yield a ‘good’ parallel load balance. Compared to the original site-repeat-agnostic data distribution algorithm, our best heuristic strategy reduces the required arithmetic operations by between 2% and 92%, with an average of 62%, across all test datasets using 2, 4, 8, 16, 32, and 64 CPUs.
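
To make the two objectives in the abstract concrete, the following is a minimal sketch of a naive fill-and-split distribution. It is not the authors' approximation algorithm and not one of the 72 heuristics; it ignores site repeats and assumes every site costs the same. The function names `distribute` and `load_metrics` are hypothetical and chosen only for this illustration.

```python
from math import ceil


def distribute(partition_sizes, n_cpus):
    """Naive illustration: fill CPUs in order, splitting a partition when a CPU is full.

    partition_sizes[i] is the number of alignment sites in partition i.
    Returns one list per CPU of (partition_id, site_count) chunks.
    """
    total = sum(partition_sizes)
    capacity = ceil(total / n_cpus)  # upper bound on the number of sites per CPU
    cpus = [[] for _ in range(n_cpus)]
    cpu, used = 0, 0
    for pid, size in enumerate(partition_sizes):
        remaining = size
        while remaining > 0:
            if used == capacity:  # current CPU is full, move to the next one
                cpu += 1
                used = 0
            take = min(remaining, capacity - used)
            cpus[cpu].append((pid, take))
            used += take
            remaining -= take
    return cpus


def load_metrics(cpus):
    """Return (max sites on any CPU, max number of distinct partitions on any CPU)."""
    max_sites = max(sum(n for _, n in chunks) for chunks in cpus)
    max_parts = max(len({pid for pid, _ in chunks}) for chunks in cpus)
    return max_sites, max_parts


if __name__ == "__main__":
    assignment = distribute([5, 3, 4], n_cpus=2)
    print(assignment)
    print(load_metrics(assignment))
```

With partitions of 5, 3, and 4 sites on 2 CPUs, the second partition is split so that each CPU holds 6 sites and touches 2 partitions. With site repeats, the cost of a site depends on which other sites share its CPU, which is exactly the situation a uniform-cost sketch like this one does not model.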

Publisher

Cold Spring Harbor Laboratory

Cited by 1 article.

1. Efficient Detection of Repeating Sites to Accelerate Phylogenetic Likelihood Calculations. Systematic Biology, 2016-08-29.