Affiliation:
1. Département Qualité des Productions Agricole, Centre de Recherches Agronomiques de Gembloux-CRAGx, 24 Chaussée de Namur, B-5030 Gembloux, Belgium
Abstract
The four most important regression methods are evaluated on very large data sets: Multiple Linear Regression (MLR), Partial Least Squares (PLS), Artificial Neural Network (ANN) and a new concept called “LOCAL” (PLS fitted, for each sample to be predicted, on a calibration subset made up of its closest neighbours). The Standard Errors of Prediction (SEPs) are tested statistically. The results show that the regression methods perform almost equally and that the data matrices matter more than the fitting methods themselves. The spectral pre-treatments (Multiplicative Scatter Correction, Detrend, Standard Normal Variate, derivative, etc.) are too numerous for every combination to be tested, so for each test the pre-treatment found to be best with the PLS method is also applied to the other methods. The second part of the paper emphasises the importance of the number of samples. If any agricultural commodity, and probably any kind of product measured by an NIR instrument, can be considered a mixture of several constituents, then a database built by collecting actual samples that each bring new information can reach hundreds, if not thousands, of samples.
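As an illustration of three ideas named in the abstract, the following minimal Python sketch shows a Standard Normal Variate pre-treatment, the selection of a calibration subset of closest neighbours as in the LOCAL concept, and a bias-corrected SEP. It is a sketch under stated assumptions, not the authors' implementation: the function names snv, closest_neighbours and sep are hypothetical, the Euclidean distance is assumed because the abstract does not state the similarity measure, the SEP formula is the common bias-corrected definition, and the PLS fit on the selected subset is left out.

```python
import math


def snv(spectrum):
    """Standard Normal Variate: centre and scale one spectrum by its own
    mean and (sample) standard deviation."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / sd for x in spectrum]


def closest_neighbours(target, library, k):
    """Return the indices of the k library spectra closest to the target.
    Assumption: Euclidean distance; the abstract does not name the metric."""
    distances = [(math.dist(target, s), i) for i, s in enumerate(library)]
    return [i for _, i in sorted(distances)[:k]]


def sep(reference, predicted):
    """Bias-corrected standard error of prediction (common definition,
    assumed here): standard deviation of the residuals around their mean."""
    residuals = [p - r for r, p in zip(reference, predicted)]
    n = len(residuals)
    bias = sum(residuals) / n
    return math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))


# Toy data, invented for illustration only.
library = [snv(s) for s in ([1.0, 2.0, 3.0], [1.0, 2.0, 4.0], [3.0, 2.0, 1.0])]
target = snv([2.0, 4.0, 6.0])

# Calibration subset for this one target; a PLS model would then be
# fitted on these samples only and used to predict the target.
subset = closest_neighbours(target, library, k=2)
```

In a LOCAL-style workflow the subset selection is repeated for every sample to be predicted, so each prediction uses its own small calibration set drawn from the large database.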
Cited by
123 articles.