The use of a genomic relationship matrix for breed assignment of cattle breeds: comparison and combination with a machine learning method

Published: 2023-01-01
Volume: 101
ISSN: 0021-8812
Container-title: Journal of Animal Science
Language: en

Author:

Wilmot Hélène¹², Niehoff Tobias³, Soyeurt Hélène², Gengler Nicolas², Calus Mario P L³

Affiliation:

1. National Fund for Scientific Research (F.R.S.-FNRS), B-1000 Brussels, Belgium

2. TERRA Teaching and Research Centre, Gembloux Agro-Bio Tech, University of Liège, B-5030 Gembloux, Belgium

3. Animal Breeding and Genomics, Wageningen University and Research, 6700AH Wageningen, the Netherlands

Abstract

To develop a breed assignment model, three main steps are generally followed: 1) the selection of breed-informative single nucleotide polymorphisms (SNP); 2) the training of a model, based on a reference population, that classifies animals to their breed of origin; and 3) the validation of the developed model on external animals, i.e., animals that were not used in the previous steps. However, there is no consensus in the literature about which methodology to follow for the first step, nor about the number of SNP to select. This can raise many questions when developing the model and lead to the use of sophisticated methodologies for selecting SNP (e.g., iterative algorithms, partitions of SNP, or combinations of several methods). Therefore, it may be of interest to avoid the first step by using all the available SNP. For this purpose, we propose the use of a genomic relationship matrix (GRM), combined or not with a machine learning method, for breed assignment. We compared it with a previously developed model based on selected informative SNP. Four methodologies were investigated: 1) the PLS_NSC methodology: selection of SNP based on a partial least squares-discriminant analysis (PLS-DA) and breed assignment by classification based on the nearest shrunken centroids (NSC) method; 2) breed assignment based on the highest mean relatedness of an animal to the reference populations of each breed (referred to as mean_GRM); 3) breed assignment based on the highest SD of the relatedness of an animal to the reference populations of each breed (referred to as SD_GRM); and 4) the GRM_SVM methodology: the use of the means and SD of the relatedness defined in the mean_GRM and SD_GRM methodologies combined with the linear support vector machine (SVM), a machine learning method used for classification.
Regarding mean global accuracies, results showed that mean_GRM and GRM_SVM were not significantly different (Bonferroni-corrected P > 0.0083) from the model based on a reduced SNP panel (PLS_NSC). Moreover, the mean_GRM and GRM_SVM methodologies were more efficient than PLS_NSC as they were faster to compute. Therefore, it is possible to bypass the selection of SNP and, by using a GRM, to develop an efficient breed assignment model. For routine use, we recommend GRM_SVM over mean_GRM as it gave a slightly higher global accuracy, which can help endangered breeds to be maintained. The script to execute the different methodologies can be accessed at: https://github.com/hwilmot675/Breed_assignment.
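
The mean_GRM and SD_GRM rules described in the abstract can be illustrated with a minimal sketch. It is not the authors' script (that is available at the repository linked above). It assumes that the genomic relationships between one candidate animal and every reference animal have already been computed. The function names, the toy relationship values, the breed labels "A" and "B", and the use of the sample SD (ddof=1) are all assumptions made for illustration.

```python
import numpy as np

def breed_features(rel_to_ref, ref_breeds):
    """Mean and SD of one animal's relatedness to each breed's reference animals.

    rel_to_ref: relationships between the candidate and each reference animal
                (one row of a GRM, restricted to reference columns).
    ref_breeds: breed label of each reference animal, in the same order.
    Requires at least two reference animals per breed (sample SD).
    """
    rel_to_ref = np.asarray(rel_to_ref, dtype=float)
    ref_breeds = np.asarray(ref_breeds)
    breeds = sorted(set(ref_breeds.tolist()))
    means = np.array([rel_to_ref[ref_breeds == b].mean() for b in breeds])
    sds = np.array([rel_to_ref[ref_breeds == b].std(ddof=1) for b in breeds])
    return breeds, means, sds

def assign_mean_grm(rel_to_ref, ref_breeds):
    """Assign the breed with the highest mean relatedness (mean_GRM rule)."""
    breeds, means, _ = breed_features(rel_to_ref, ref_breeds)
    return breeds[int(np.argmax(means))]

def assign_sd_grm(rel_to_ref, ref_breeds):
    """Assign the breed with the highest SD of relatedness (SD_GRM rule)."""
    breeds, _, sds = breed_features(rel_to_ref, ref_breeds)
    return breeds[int(np.argmax(sds))]

# Toy data (hypothetical): three reference animals per breed.
ref_breeds = ["A", "A", "A", "B", "B", "B"]
rel = [0.12, 0.02, 0.07, -0.03, -0.01, -0.02]

print(assign_mean_grm(rel, ref_breeds))
print(assign_sd_grm(rel, ref_breeds))
```

For a GRM_SVM-style approach, the per-breed means and SDs returned by `breed_features` could be concatenated into one feature vector per animal and passed to a linear SVM classifier trained on animals of known breed. Which software and settings the authors used for that step is not stated in the abstract.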

Funder

Fonds De La Recherche Scientifique - FNRS

Wallonia-Brussels Federation

Publisher

Oxford University Press (OUP)

Subject

Genetics, Animal Science and Zoology, General Medicine, Food Science

Link

https://academic.oup.com/jas/advance-article-pdf/doi/10.1093/jas/skad172/50426008/skad172.pdf

References: 27 articles.

1. Preselection statistics and Random Forest classification identify population informative single nucleotide polymorphisms in cosmopolitan and autochthonous cattle breeds;Bertolini;Animal,2018

2. Evaluating the replicability of significance tests for comparing learning algorithms.;Bouckaert,2004

3. Calc_grm—a program to compute pedigree, genomic, and combined relationship matrices.;Calus,2016

4. Genomic breed prediction in New Zealand sheep;Dodds;BMC Genet,2014

5. Estimation of genome-wide and locus-specific breed composition in pigs;Funkhouser;Transl. Anim. Sci,2017

Cited by 2 articles.

1. An overview of recent technological developments in bovine genomics;Veterinary and Animal Science;2024-09

2. Genetic Distinctness and Diversity of American Aberdeen Cattle Compared to Common Beef Breeds in the United States;Genes;2023-09-22