A Comparison of Three Machine Learning Methods for Multivariate Genomic Prediction Using the Sparse Kernels Method (SKM) Library

Published: 2022-08-21 Issue: 8 Volume: 13 Page: 1494
ISSN: 2073-4425
Container-title: Genes
Language: en
Short-container-title: Genes

Author:

Montesinos-López Osval A., Montesinos-López Abelardo, Cano-Paez Bernabe, Hernández-Suárez Carlos Moisés, Santana-Mancilla Pedro C., Crossa José

Abstract

Genomic selection (GS) has changed the way plant breeders select genotypes. GS uses phenotypic and genotypic information to train a statistical machine learning model, which is then used to predict phenotypic (or breeding) values of new lines for which only genotypic information is available. Many statistical machine learning methods have therefore been proposed for this task. Multi-trait (MT) genomic prediction models take advantage of correlated traits to improve prediction accuracy, so some multivariate statistical machine learning methods are popular for GS. In this paper, we compare the prediction performance of three MT methods: the MT genomic best linear unbiased predictor (GBLUP), the MT partial least squares (PLS) and the MT random forest (RF) methods. Benchmarking was performed with six real datasets. We found that the three investigated methods produce similar results. However, under predictors with environment (E) and genotype (G), that is, E + G, the MT GBLUP achieved superior performance, whereas under predictors E + G + genotype × environment (GE) and G + GE, random forest achieved the best results. We also found that the best predictions were achieved under the predictors E + G and E + G + GE. We also provide the R code for implementing these three statistical machine learning methods in the Sparse Kernels Method (SKM) library, which offers not only options for single-trait prediction with various statistical machine learning methods but also some options for MT prediction that can help to better capture the complex patterns in datasets that are common in genomic selection.
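
The three predictor sets named in the abstract (E + G, E + G + GE and G + GE) can be pictured as column blocks of a design matrix. The following is a minimal sketch in plain Python, not the authors' R code and not the SKM API. It assumes that E is encoded as environment indicators, G as raw marker codes, and GE as the products of the two. The helper names `one_hot` and `build_predictors` and the toy data are hypothetical, and the paper itself may construct these terms differently (for example, from kernels).

```python
# Illustrative only: hypothetical helpers, not part of the SKM library.

def one_hot(labels):
    """Return (sorted levels, one indicator row per label)."""
    levels = sorted(set(labels))
    rows = [[1.0 if lab == lev else 0.0 for lev in levels] for lab in labels]
    return levels, rows


def build_predictors(envs, markers, terms):
    """Concatenate the requested blocks column-wise for every record.

    envs    : environment label of each record
    markers : marker codes of each record (equal-length lists)
    terms   : subset of {"E", "G", "GE"}
    """
    _, env_rows = one_hot(envs)
    design = []
    for e_row, g_row in zip(env_rows, markers):
        row = []
        if "E" in terms:
            row.extend(e_row)
        if "G" in terms:
            row.extend(float(g) for g in g_row)
        if "GE" in terms:
            # Assumed interaction encoding: each marker times each
            # environment indicator.
            row.extend(e * g for e in e_row for g in g_row)
        design.append(row)
    return design


# Toy data: two lines, each observed in two environments, three markers.
envs = ["env1", "env1", "env2", "env2"]
markers = [[0, 1, 2], [2, 0, 1], [0, 1, 2], [2, 0, 1]]

x_eg = build_predictors(envs, markers, {"E", "G"})
x_egge = build_predictors(envs, markers, {"E", "G", "GE"})
x_gge = build_predictors(envs, markers, {"G", "GE"})

# Number of predictor columns under each scenario.
print(len(x_eg[0]), len(x_egge[0]), len(x_gge[0]))
```

With two environments and three markers, the E block contributes 2 columns, the G block 3, and the GE block 6. Any multi-trait learner would then be fitted on one of these matrices against a response with one column per trait.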

Funder

Bill and Melinda Gates Foundation

Publisher

MDPI AG

Subject

Genetics (clinical), Genetics

Link

https://www.mdpi.com/2073-4425/13/8/1494/pdf
