Author:
Montesinos López Osval Antonio, Montesinos López Abelardo, Crossa Jose
Abstract
Nowadays, huge quantities of data are collected and analyzed to deliver deep insights into biological processes and human behavior. This chapter assesses the use of big data for prediction and estimation through statistical machine learning and its applications in agriculture and genetics in general, and specifically for genome-based prediction and selection. First, we point out the importance of data and how the use of data is reshaping our way of living. We also provide the key elements of genomic selection and its potential for plant improvement. In addition, we analyze elements of modeling with machine learning methods applied to genomic selection and stress their importance as a predictive methodology. Two cultures of model building are analyzed and discussed: prediction and inference. By understanding model building, researchers will be able to select the best model/method for each circumstance. Within this context, we explain the differences between nonparametric models (predictors are constructed according to information derived from data) and parametric models (all the predictors take predetermined forms with the response), as well as their types of effects: fixed, random, and mixed. Basic elements of linear algebra are provided to facilitate understanding of the contents of the book. This chapter also contains examples of the different types of data using supervised, unsupervised, and semi-supervised learning methods.
Funder
Bill and Melinda Gates Foundation
Publisher
Springer International Publishing