Using visual scores and categorical data for genomic prediction of complex traits in breeding programs

Published: 2023-04-04

Author:

Azevedo Camila Ferreira¹, Ferrão Luis Felipe Ventorim², Benevenuto Juliana², de Resende Marcos Deon Vilela³, Nascimento Moyses¹, Nascimento Ana Carolina Campana¹, Munoz Patricio R²

Affiliation:

1. Federal University of Viçosa (Universidade Federal de Viçosa)

2. University of Florida

3. Empresa Brasileira de Pesquisa Agropecuária Florestas

Abstract

Most genomic prediction methods are based on assumptions of normality due to their simplicity and ease of implementation. However, in plant and animal breeding, traits are often collected as categorical data, thus violating the normality assumption, which could affect the prediction of breeding values and the estimation of genetic parameters. In this study, we examined the main challenges of categorical phenotypes in genomic prediction and genetic parameter estimation using mixed models, Bayesian and machine learning methods. We evaluated these approaches using simulated and real breeding data sets. Our contribution in this study is a five-fold demonstration: (i) collecting data using an intermediate number of categories (1 to 3 and 1 to 5) is the best strategy, even considering errors associated with visual scores; (ii) Linear Mixed Models and Bayesian Linear Regression are robust to the normality violation, but marginal gains can be achieved when using Bayesian Ordinal Regression Models (BORM) and Random Forest Classification; (iii) genetic parameters are better estimated using BORM; (iv) our conclusions using simulated data are also applicable to real data in autotetraploid blueberry; and (v) a comparison of continuous and categorical phenotypes found that investing in the evaluation of 600–1000 categorical data points with low error, when it is not feasible to collect continuous phenotypes, is a strategy for improving predictive abilities. Our findings suggest the best approaches for effectively using categorical traits to explore genetic information in breeding programs and highlight the importance of investing in the training of evaluator teams and in high-quality phenotyping.
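To make the idea of scoring a trait in categories more concrete, the following is a minimal sketch, not the authors' pipeline. It simulates a continuous phenotype, cuts it into k ordered scores, and fits a plain ridge regression that treats the scores as if they were continuous, which stands in loosely for the linear mixed model approach mentioned above. Everything in it is an assumption made for illustration: the function names, the diploid 0/1/2 marker coding (the real data in the study are autotetraploid), the sample sizes, the heritability, the equal-probability thresholds, and the ridge penalty.

```python
import numpy as np


def simulate(n=400, m=200, h2=0.5, seed=0):
    """Simulate centred genotypes, true breeding values and a continuous phenotype."""
    rng = np.random.default_rng(seed)
    X = rng.binomial(2, 0.5, size=(n, m)).astype(float)  # diploid 0/1/2 coding (assumption)
    X -= X.mean(axis=0)
    beta = rng.normal(size=m)
    g = X @ beta
    g = (g - g.mean()) / g.std()  # breeding values scaled to unit variance
    e = rng.normal(scale=np.sqrt((1 - h2) / h2), size=n)  # residual sized to match h2
    return X, g, g + e


def to_scores(y, k):
    """Cut a continuous phenotype into k ordered scores (1..k) at equal-probability quantiles."""
    cuts = np.quantile(y, np.linspace(0, 1, k + 1)[1:-1])
    return np.searchsorted(cuts, y, side="right") + 1


def ridge_fit_predict(X_train, y_train, X_test, lam):
    """Ridge regression on markers, treating the scores as a continuous response."""
    mu = y_train.mean()
    A = X_train.T @ X_train + lam * np.eye(X_train.shape[1])
    beta = np.linalg.solve(A, X_train.T @ (y_train - mu))
    return mu + X_test @ beta


def predictive_ability(k, seed=0):
    """Correlation between predictions and true breeding values in a held-out set."""
    X, g, y = simulate(seed=seed)
    scores = to_scores(y, k).astype(float)
    train, test = np.arange(300), np.arange(300, 400)
    pred = ridge_fit_predict(X[train], scores[train], X[test], lam=float(X.shape[1]))
    return np.corrcoef(pred, g[test])[0, 1]


if __name__ == "__main__":
    for k in (2, 3, 5, 10):
        print(k, round(predictive_ability(k), 3))
```

Running the script prints one predictive ability per number of categories. A single simulated data set of this size says little on its own; the sketch only shows the mechanics of discretising a phenotype and predicting from scores, and it omits scoring errors, ordinal models and Random Forest classification.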

Publisher

Research Square Platform LLC


Cited by 1 article.

1. Enhancing grapevine breeding efficiency through genomic prediction and selection index;2023-08-03