Genomic Prediction in Animals and Plants: Simulation of Data, Validation, Reporting, and Benchmarking

Published: 2013-02-01 Volume: 193 Issue: 2 Pages: 347-365
ISSN: 1943-2631
Journal: Genetics
Language: en

Author:

Daetwyler Hans D¹, Calus Mario P L², Pong-Wong Ricardo³, de los Campos Gustavo⁴, Hickey John M⁵,⁶

Affiliation:

1. Biosciences Research Division, Department of Primary Industries, Bundoora 3083, Victoria, Australia

2. Animal Breeding and Genomics Centre, Wageningen University Research Livestock Research, 8200 AB Lelystad, The Netherlands

3. The Roslin Institute, Royal Dick School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG Scotland, United Kingdom

4. Department of Biostatistics, School of Public Health, University of Alabama, Birmingham, Alabama 35294

5. School of Environmental and Rural Science, University of New England, Armidale 2351, New South Wales, Australia

6. Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), 06600 Mexico, D.F., Mexico

Abstract

The genomic prediction of phenotypes and breeding values in animals and plants has developed rapidly into its own research field. Results of genomic prediction studies are often difficult to compare because data simulation varies, real or simulated data are not fully described, and not all relevant results are reported. In addition, some new methods have been compared only in limited genetic architectures, leading to potentially misleading conclusions. In this article we review simulation procedures, discuss validation and reporting of results, and apply benchmark procedures for a variety of genomic prediction methods in simulated and real example data.

Plant and animal breeding programs are being transformed by the use of genomic data, which are becoming widely available and cost-effective to predict genetic merit. A large number of genomic prediction studies have been published using both simulated and real data. The relative novelty of this area of research has made the development of scientific conventions difficult with regard to description of the real data, simulation of genomes, validation and reporting of results, and forward-in-time methods. In this review article we discuss the generation of simulated genotype and phenotype data, using approaches such as the coalescent and forward-in-time simulation. We outline ways to validate simulated data and genomic prediction results, including cross-validation. The accuracy and bias of genomic prediction are highlighted as performance indicators that should be reported. We suggest that a measure of relatedness between the reference and validation individuals be reported, as its impact on the accuracy of genomic prediction is substantial. A large number of methods were compared in example simulated and real (pine and wheat) data sets, all of which are publicly available. In our limited simulations, most methods performed similarly in traits with a large number of quantitative trait loci (QTL), whereas in traits with fewer QTL variable selection did have some advantages. In the real data sets examined here all methods had very similar accuracies. We conclude that no single method can serve as a benchmark for genomic prediction. We recommend comparing accuracy and bias of new methods to results from genomic best linear prediction and a variable selection approach (e.g., BayesB), because, together, these methods are appropriate for a range of genetic architectures. An accompanying article in this issue provides a comprehensive review of genomic prediction methods and discusses a selection of topics related to application of genomic prediction in plants and animals.
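
The abstract names cross-validation, accuracy, and bias as the quantities to report. The sketch below is one possible illustration of that workflow and is not the authors' code. It makes several assumptions that are not in the source: genotypes are simulated as independent markers (not with the coalescent or forward-in-time approaches the article reviews), ridge regression on marker genotypes stands in for a genomic best linear prediction model, accuracy is taken as the correlation between simulated true breeding values and predictions, and bias is summarized as the slope of the regression of true values on predictions. All function names and parameter values (`simulate`, `ridge_predict`, `cross_validate`, `n`, `m`, `n_qtl`, `h2`) are hypothetical.

```python
import numpy as np


def simulate(n=400, m=300, n_qtl=50, h2=0.5, seed=0):
    """Simulate genotypes, phenotypes, and true breeding values (TBV).

    Assumption: markers are independent draws coded 0/1/2; a random
    subset of n_qtl markers carries normally distributed effects.
    """
    rng = np.random.default_rng(seed)
    freq = rng.uniform(0.1, 0.9, size=m)            # allele frequencies
    X = rng.binomial(2, freq, size=(n, m)).astype(float)
    beta = np.zeros(m)
    qtl = rng.choice(m, size=n_qtl, replace=False)
    beta[qtl] = rng.normal(size=n_qtl)
    tbv = X @ beta
    tbv = (tbv - tbv.mean()) / tbv.std()            # genetic variance = 1
    noise_sd = np.sqrt((1.0 - h2) / h2)             # gives heritability h2
    y = tbv + rng.normal(scale=noise_sd, size=n)
    return X, y, tbv


def ridge_predict(X_ref, y_ref, X_val, h2):
    """Ridge regression on centered genotypes, solved in its dual form."""
    mu_x = X_ref.mean(axis=0)
    mu_y = y_ref.mean()
    Z = X_ref - mu_x
    # Assumed penalty: residual variance over per-marker effect variance.
    lam = Z.var(axis=0).sum() * (1.0 - h2) / h2
    K = Z @ Z.T
    alpha = np.linalg.solve(K + lam * np.eye(len(y_ref)), y_ref - mu_y)
    return mu_y + (X_val - mu_x) @ (Z.T @ alpha)


def cross_validate(X, y, tbv, k=5, h2=0.5, seed=1):
    """k-fold cross-validation returning (accuracy, regression slope)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    pred = np.empty(len(y))
    for fold in folds:
        ref = np.setdiff1d(np.arange(len(y)), fold)  # reference individuals
        pred[fold] = ridge_predict(X[ref], y[ref], X[fold], h2)
    accuracy = np.corrcoef(tbv, pred)[0, 1]
    # Slope of TBV on prediction; values far from 1 suggest biased predictions.
    slope = np.cov(tbv, pred)[0, 1] / np.var(pred, ddof=1)
    return accuracy, slope


if __name__ == "__main__":
    X, y, tbv = simulate()
    accuracy, slope = cross_validate(X, y, tbv)
    print(f"accuracy = {accuracy:.3f}, slope = {slope:.3f}")
```

In real data the true breeding values are unknown, so the correlation and slope would instead be computed against observed phenotypes (or another proxy) in the validation set. The random fold assignment used here also ignores relatedness between reference and validation individuals, which the abstract recommends reporting.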

Publisher

Oxford University Press (OUP)

Subject

Genetics

Link

https://academic.oup.com/genetics/article-pdf/193/2/347/42120474/genetics0347.pdf

References (112 articles)

1. Implications of avoiding overlap between training and testing data sets when evaluating genomic predictions of genetic merit;Amer;J. Dairy Sci.,2010

2. Prospects for genomewide selection for quantitative traits in maize;Bernardo;Crop Sci.,2007

3. Accuracies of estimated breeding values from ordinary genetic evaluations do not reflect the correlation between true and estimated breeding values in selected populations;Bijma;J. Anim. Breed. Genet.,2012

4. Accuracy of multi-trait genomic selection using different methods;Calus;Genet. Sel. Evol.,2011

5. Genomic breeding value prediction: methods and procedures;Calus;Animal,2010

Cited by 353 articles.

1. Predicting rice phenology across China by integrating crop phenology model and machine learning;Science of The Total Environment;2024-11

2. High-throughput seed quality analysis in faba bean: leveraging Near-InfraRed spectroscopy (NIRS) data and statistical methods;Food Chemistry: X;2024-10

3. Performance of phenomic selection in rice: effects of population size and genotype-environment interactions on predictive ability;2024-08-15

4. The hazard prediction problem;Safety Science;2024-08

5. Enhancing genomic prediction with Stacking Ensemble Learning in Arabica Coffee;Frontiers in Plant Science;2024-07-17