Definition of metafounders based on population structure analysis

Author:

Anglhuber Christine, Edel Christian, Pimentel Eduardo C. G., Emmerling Reiner, Götz Kay-Uwe, Thaller Georg

Abstract

Background: Limitations of the concept of identity by descent in the presence of stratification within a breeding population may lead to an incomplete formulation of the conventional numerator relationship matrix ($$\mathbf{A}$$). Combining $$\mathbf{A}$$ with the genomic relationship matrix ($$\mathbf{G}$$) in a single-step approach for genetic evaluation may cause inconsistencies that can be a source of bias in the resulting predictions. The objective of this study was to identify stratification using genomic data and to transfer this information to matrix $$\mathbf{A}$$, to improve the compatibility of $$\mathbf{A}$$ and $$\mathbf{G}$$.

Methods: Using software to detect population stratification (ADMIXTURE), we developed an iterative approach. First, we identified 2 to 40 strata ($$k$$) with ADMIXTURE, which we then introduced in a stepwise manner into matrix $$\mathbf{A}$$ to generate matrix $$\mathbf{A}^{\boldsymbol{\Gamma}}$$ using the metafounder methodology. Improvements in consistency between $$\mathbf{G}$$ and $$\mathbf{A}^{\boldsymbol{\Gamma}}$$ were evaluated by regression analysis and by comparing the overall mean and the mean diagonal values of both matrices. The approach was tested on genotype and pedigree information of 85,249 European and North American Brown Swiss animals. Analyses with ADMIXTURE were initially performed on the full set of genotypes (S1). In addition, we used an alternative dataset in which we avoided sampling closely related animals (S2).

Results: Regressing standard $$\mathbf{A}$$ on $$\mathbf{G}$$ gave −0.489, 0.780 and 0.647 for the intercept, slope and fit of the regression. For the S1 data, the corresponding values for the regression of $$\mathbf{A}^{\boldsymbol{\Gamma}}$$ on $$\mathbf{G}$$ were −0.028, 1.087 and 0.807 at $$k = 7$$, but there was no clear optimum $$k$$. Analyses of S2 gave a clear optimum at $$k = 24$$, with −0.020, 0.998 and 0.817 for intercept, slope and fit. For this $$k$$, differences in the overall mean and the mean diagonal values between the two matrices were negligible.

Conclusions: Deriving hidden stratification information from genotyped animals and integrating it into $$\mathbf{A}$$ considerably improved the compatibility of the resulting $$\mathbf{A}^{\boldsymbol{\Gamma}}$$ with $$\mathbf{G}$$ compared to the initial situation. In dairy breeding populations with large half-sib families as sub-structures, it is necessary to balance the data when applying population structure analysis to obtain meaningful results.
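The consistency checks named in the abstract (regression of a pedigree-based matrix on $$\mathbf{G}$$, plus the comparison of overall mean and mean diagonal) can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `compare_relationship_matrices` is hypothetical, and the sketch assumes that the regression uses the upper triangle including the diagonal and that "fit" means the coefficient of determination; the abstract states neither.

```python
from statistics import fmean


def compare_relationship_matrices(A, G):
    """Regress elements of A on elements of G and compare matrix means.

    A and G are square, same-sized matrices given as lists of lists.
    Assumptions (not stated in the abstract): the upper triangle including
    the diagonal is used, and "fit" is taken to be R^2.
    """
    n = len(A)
    if n == 0 or len(G) != n or any(len(r) != n for r in A) or any(len(r) != n for r in G):
        raise ValueError("A and G must be non-empty square matrices of equal size")

    # Paired elements from the upper triangle (each relationship counted once).
    a = [A[i][j] for i in range(n) for j in range(i, n)]
    g = [G[i][j] for i in range(n) for j in range(i, n)]

    mean_a, mean_g = fmean(a), fmean(g)
    sxx = sum((x - mean_g) ** 2 for x in g)
    syy = sum((y - mean_a) ** 2 for y in a)
    sxy = sum((x - mean_g) * (y - mean_a) for x, y in zip(g, a))
    if sxx == 0 or syy == 0:
        raise ValueError("matrix elements must vary for a regression to be defined")

    # Ordinary least squares with A as response and G as predictor.
    slope = sxy / sxx
    intercept = mean_a - slope * mean_g
    r_squared = sxy ** 2 / (sxx * syy)

    # Differences (A minus G) in overall mean and in mean diagonal.
    overall_diff = fmean(v for row in A for v in row) - fmean(v for row in G for v in row)
    diag_diff = fmean(A[i][i] for i in range(n)) - fmean(G[i][i] for i in range(n))

    return {
        "intercept": intercept,
        "slope": slope,
        "r_squared": r_squared,
        "overall_mean_diff": overall_diff,
        "diag_mean_diff": diag_diff,
    }
```

Under these assumptions, two fully compatible matrices would give an intercept near 0, a slope near 1 and mean differences near 0. That is the pattern the abstract reports for S2 at $$k = 24$$.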

Funder

Arbeitsgemeinschaft Süddeutscher Rinderzucht- und Besamungsorganisationen e.V.

Bayerische Landesanstalt für Landwirtschaft

Publisher

Springer Science and Business Media LLC
