Author:
Selle Maria Lie, Steinsland Ingelin, Lindgren Finn, Brajkovic Vladimir, Cubric-Curik Vlatka, Gorjanc Gregor
Abstract
We introduce a hierarchical model to estimate haplotype effects based on phylogenetic relationships between haplotypes and their association with observed phenotypes. In a population there are many, but not all possible, distinct haplotypes and few observations per haplotype. Further, haplotype frequencies tend to vary substantially. Such data structures challenge the estimation of haplotype effects. However, haplotypes often differ by only a few mutations, and leveraging these similarities can improve the estimation of effects. We build on an extensive literature and develop an autoregressive model of order one that models haplotype effects by leveraging phylogenetic relationships described with a directed acyclic graph. The phylogenetic relationships can be in the form of either a tree or a network, and we refer to the model as the haplotype network model. The model can be included as a component in a phenotype model to estimate associations between haplotypes and phenotypes. Our key contribution is that we obtain a sparse model and that, by using hierarchical autoregression, the flow of information between similar haplotypes is estimated from the data. A simulation study shows that the hierarchical model can improve estimates of haplotype effects compared to an independent haplotype model, especially when there are few observations for a specific haplotype. We also compared it to a mutation model and observed comparable performance, though the haplotype model has the potential to capture background-specific effects. We demonstrate the model with a study of mitochondrial haplotype effects on milk yield in cattle. We provide R code to fit the model with the INLA package.
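The idea of an order-one autoregression over a directed acyclic graph can be illustrated with a minimal Python sketch. This is not the authors' R/INLA code, and the details are assumptions made for illustration only: the function name `propagate_effects`, the haplotype labels, the use of the plain average of parental effects when a haplotype has several parents, and the unscaled innovations are all hypothetical. Each haplotype effect is taken to be `rho` times the mean effect of its parents plus its own innovation, so `rho` controls how much information flows between related haplotypes.

```python
import random
from statistics import mean


def propagate_effects(parents, innovations, rho):
    """Compute haplotype effects on a DAG with a first-order autoregression.

    parents:     dict mapping each haplotype to a list of its parent haplotypes,
                 listed in topological order (parents before children).
    innovations: dict mapping each haplotype to its own deviation.
    rho:         autoregression parameter (assumed here to lie in (-1, 1)).

    Assumption for illustration: with several parents, the parental
    contribution is the plain average of the parents' effects.
    """
    effects = {}
    for hap, pars in parents.items():
        # Founder haplotypes have no parents, so only the innovation remains.
        parental = mean(effects[p] for p in pars) if pars else 0.0
        effects[hap] = rho * parental + innovations[hap]
    return effects


# Hypothetical network: H3 descends from both H1 and H2 (a network, not a tree).
network = {"H0": [], "H1": ["H0"], "H2": ["H0"], "H3": ["H1", "H2"]}

rng = random.Random(1)
draws = {hap: rng.gauss(0.0, 1.0) for hap in network}
simulated = propagate_effects(network, draws, rho=0.8)
```

With `rho` close to zero the effects behave like independent haplotype effects, while values close to one make related haplotypes nearly identical; in the hierarchical setting described in the abstract, this parameter is estimated from the data rather than fixed.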
Funder
Norges Forskningsråd
Biotechnology and Biological Sciences Research Council
Hrvatska Zaklada za Znanost
Subject
Genetics (clinical), Genetics, Molecular Medicine
Cited by
8 articles.