Affiliation:
1. Computer Science, University of California - Riverside, 900 University Avenue, Riverside, California 92521, USA
Abstract
Inferring the haplotypes of the members of a pedigree from their genotypes has been extensively studied. However, most studies do not consider genotyping errors and de novo mutations. In this paper, we study how to infer haplotypes from genotype data that may contain genotyping errors, de novo mutations, and missing alleles. We assume that there are no recombinants in the genotype data, which is usually true for tightly linked markers. We introduce a combinatorial optimization problem, called haplotype configuration with mutations and errors (HCME), which calls for haplotype configurations consistent with the given genotypes that incur no recombinants and require the minimum number of mutations and errors. HCME is NP-hard. To solve the problem, we propose a heuristic algorithm, the core of which is an integer linear program (ILP) using the system of linear equations over Galois field GF(2). Our algorithm can detect and locate genotyping errors that cannot be detected by simply checking the Mendelian law of inheritance. The algorithm also offers error correction in genotypes/haplotypes rather than just detecting inconsistencies and deleting the involved loci. Our experimental results show that the algorithm can infer haplotypes with a very high accuracy and recover 65%–94% of genotyping errors depending on the pedigree topology.
Publisher
World Scientific Pub Co Pte Ltd
Subject
Computer Science Applications,Molecular Biology,Biochemistry
Cited by
4 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献