Comparison of machine learning methods for genomic prediction of selected Arabidopsis thaliana traits

Published: 2024-08-28
Issue: 8
Volume: 19
Page: e0308962
ISSN: 1932-6203
Container-title: PLOS ONE
Language: en
Short-container-title: PLoS ONE

Author:

Kelly Ciaran Michael, McLaughlin Russell Lewis

Abstract

We present a comparison of machine learning methods for the prediction of four quantitative traits in Arabidopsis thaliana. High prediction accuracies were achieved on individuals grown under standardized laboratory conditions from the 1001 Arabidopsis Genomes Project. An existing body of evidence suggests that linear models may be impeded by their inability to make use of non-additive effects to explain phenotypic variation at the population level. The results presented here use a nested cross-validation approach to confirm that some machine learning methods have the ability to statistically outperform linear prediction models, with the optimal model dependent on availability of training data and genetic architecture of the trait in question. Linear models were competitive in their performance as per previous work, though the neural network class of predictors was observed to be the most accurate and robust for traits with high heritability. The extent to which non-linear models exploit interaction effects will require further investigation of the causal pathways that lie behind their predictions. Future work utilizing more traits and larger sample sizes, combined with an improved understanding of their respective genetic architectures, may lead to improvements in prediction accuracy.

Funder

Science Foundation Ireland

Motor Neurone Disease Association

Publisher

Public Library of Science (PLoS)

References (57 articles)

1. Prediction of Total Genetic Value Using Genome-Wide Dense Marker Maps;THE Meuwissen;Genetics,2001

2. Molecular Plant Breeding as the Foundation for 21st Century Crop Improvement;SP Moose;Plant Physiology,2008

3. Epistasis: what it means, what it doesn’t mean, and statistical methods to detect it in humans;HJ Cordell;Human Molecular Genetics,2002

4. Genome-wide association studies in plants: the missing heritability is in the field;B Brachi;Genome Biology,2011