Author:
Lai En-Yu,Wang Wei-Bung,Jiang Tao,Wu Kun-Pin
Abstract
Background
When studying genetic diseases in which genetic variations are passed on to offspring, the ability to distinguish between paternal and maternal alleles is essential. Determining haplotypes from genotype data is called haplotype inference. Most existing computational algorithms for haplotype inference have been designed to use genotype data collected from individuals in the form of a pedigree. Because a haplotype is regarded as a hereditary unit, pedigrees that are free of mutational events and have a minimum number of recombination events are preferred as input. These ideas motivated the zero-recombinant haplotype configuration (ZRHC) problem, which strictly follows the Mendelian law of inheritance: one haplotype of each child is inherited from the father and the other from the mother, both without any mutation. So far, no linear-time algorithm for ZRHC has been proposed for general pedigrees, even though the number of mating loops in a human pedigree is usually very small and can be regarded as constant.
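The zero-recombinant condition described above can be illustrated for a single father-mother-child trio. The following is a minimal sketch, not the algorithm of the paper: it assumes haplotypes are already known and represented as tuples of alleles (one entry per marker locus), and the names `genotype` and `is_zero_recombinant_child` are hypothetical helpers introduced only for illustration.

```python
# Illustrative sketch only: haplotypes are tuples of alleles, one per locus.
# An individual is a pair (haplotype_1, haplotype_2).

def genotype(individual):
    """Return the unordered allele pair at each locus (phase is discarded)."""
    h1, h2 = individual
    return [tuple(sorted((a, b))) for a, b in zip(h1, h2)]

def is_zero_recombinant_child(father, mother, child):
    """True if one child haplotype is an exact copy of a paternal haplotype
    and the other is an exact copy of a maternal haplotype
    (no recombination, no mutation)."""
    c1, c2 = child
    return (c1 in father and c2 in mother) or (c2 in father and c1 in mother)

# Hypothetical data with three marker loci and alleles coded 1 or 2.
father = ((1, 2, 1), (2, 1, 2))
mother = ((1, 1, 1), (2, 2, 2))

child_ok = ((2, 1, 2), (2, 2, 2))            # whole haplotypes copied
child_recombinant = ((1, 1, 2), (1, 1, 1))   # (1, 1, 2) mixes both paternal haplotypes

print(is_zero_recombinant_child(father, mother, child_ok))
print(is_zero_recombinant_child(father, mother, child_recombinant))
print(genotype(child_ok))
```

Haplotype inference is the reverse direction: only the output of `genotype` is observed, and the task is to recover haplotype pairs for every individual such that a check like this holds for each child in the pedigree.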
Results
Given a pedigree with n individuals, m marker loci, and k mating loops, we propose an algorithm that provides a general solution to the zero-recombinant haplotype configuration problem in O(kmn + k^2 m) time. In addition, this algorithm can be modified to detect inconsistencies within the genotype data without loss of efficiency. The proposed algorithm was subjected to 12,000 experiments to verify its performance using different (n, m) combinations. The value of k was uniformly distributed between zero and six throughout all experiments. The experimental results show that execution time grows linearly with input size when both n and m are larger than 100. For those experiments where n or m is less than 100, the proposed algorithm runs very fast, in thousandths to hundredths of a second, on a personal desktop computer.
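To make the notion of an inconsistency in genotype data concrete, here is a minimal sketch of a per-locus Mendelian check for one trio. This is an assumption-laden illustration and not the detection procedure of the proposed algorithm: passing it is only a necessary condition, since a pedigree can be consistent at every locus separately and still admit no zero-recombinant solution. The function names are hypothetical.

```python
# Illustrative sketch only: a genotype is a list of unordered allele pairs,
# one pair per marker locus.

def locus_consistent(f, m, c):
    """True if the child's pair c can be formed from one allele of the
    father's pair f and one allele of the mother's pair m."""
    return any(sorted((a, b)) == sorted(c) for a in f for b in m)

def inconsistent_loci(father_g, mother_g, child_g):
    """Return the indices of loci that violate Mendelian inheritance."""
    return [
        j
        for j, (f, m, c) in enumerate(zip(father_g, mother_g, child_g))
        if not locus_consistent(f, m, c)
    ]

# Hypothetical genotypes with two loci and alleles coded 1 or 2.
father_g = [(1, 1), (1, 2)]
mother_g = [(1, 1), (2, 2)]
child_g = [(1, 2), (1, 2)]   # locus 0: neither parent carries allele 2

print(inconsistent_loci(father_g, mother_g, child_g))
```

This check costs time proportional to m per trio, so running it over all trios of a pedigree stays linear in the input size.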
Conclusions
We have developed the first deterministic linear-time algorithm for the zero-recombinant haplotype configuration problem. Our experimental results demonstrated the linearity of its execution time in relation to the input size. The proposed algorithm can be modified to detect inconsistency within the genotype data without loss of efficiency and is expected to be able to handle recombinant and missing data with further extension.
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Structural Biology
Cited by
6 articles.