Affiliation:
1. State Key Laboratory of Animal Biotech Breeding, National Engineering Laboratory of Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, MARA, College of Animal Science and Technology, China Agricultural University, Beijing, China
2. Department of Animal Sciences, Purdue University, West Lafayette, Indiana, USA
3. Institute of Animal Science and Veterinary Medicine, Shandong Academy of Agricultural Sciences, Jinan, China
4. Department of Comparative Biomedicine and Food Science, University of Padova, Padova, Italy
Abstract
Milk mid‐infrared (MIR) spectra have been shown to provide valuable information on a wide range of traits that can be used in dairy cattle breeding programs. Selecting the most informative variables from complex data can improve prediction accuracy and model robustness and, consequently, the interpretability of MIR spectra. Thus, we aimed to investigate the prediction performance of feature selection methods applied to MIR spectra data, using the milk fatty acid (FA) profile as an example to illustrate the procedure. MIR spectra, milk test‐day records, and reference FA concentrations from 155 first‐parity Holstein cows were used in the analyses. Four models comprising different explanatory variables and three feature selection methods were evaluated. The results indicated that the competitive adaptive reweighted sampling (CARS) method can effectively select the most informative variables from the MIR spectra, resulting in higher prediction accuracies than the other variable selection approaches. The model including selected MIR spectra and cow information variables yielded the best FA profile predictions based on partial least squares regression. C8:0, C10:0, C14:1, C17:0 isomers, C18:1, C18:1 isomer, medium‐chain FA, unsaturated FA, monounsaturated FA, and polyunsaturated FA had coefficients of determination ranging from 0.66 to 0.85 in internal validation and from 0.65 to 0.84 in external validation. The wavenumbers most strongly related to the 35 FAs were found within 1003 to 1145 cm⁻¹. Generally, using CARS and cow information improved predictions of FAs based on MIR spectra in Chinese Holstein dairy cows. Additional validation studies should be conducted as larger datasets become available.
Funder
Natural Science Foundation of Shandong Province
National Natural Science Foundation of China
National Key Research and Development Program of China
Earmarked Fund for China Agriculture Research System