Affiliation:
1. Departamento de Química Física, Facultade de Ciencias, Universidade de Vigo, 32004 Ourense, Spain
Abstract
The shiitake mushroom has gained popularity over the last decade and is now the second most consumed mushroom worldwide, offering consumers a wide variety of nutritional and health benefits. The origin of these mushrooms is often unclear, so establishing it is of great importance to consumers. In this research, different machine learning algorithms were developed to determine the geographical origin of shiitake mushrooms (Lentinula edodes) consumed in Korea, based on experimental data reported in the literature (δ13C, δ15N, δ18O, δ34S, and origin). When the origin is classified into three categories (Korean, Chinese, and mushrooms from Chinese inoculated sawdust blocks), the random forest model presents the highest accuracy (0.940) and the highest kappa (0.908) in the validation phase. When the origin is classified into two categories (Korean and Chinese, with mushrooms from Chinese inoculated sawdust blocks included in the latter), the support vector machine model is chosen as the best model because of its high accuracy (0.988) and kappa (0.975) values in the validation phase. Finally, when the origin is classified into two categories (Korean and Chinese, but this time with mushrooms from Chinese inoculated sawdust blocks included in the Korean category), the best model is the random forest because of its higher accuracy (0.952) in the validation phase (kappa of 0.869). The accuracy values in the testing phase for the selected models are acceptable (between 0.839 and 0.964); therefore, the predictive capacity of the models could be acceptable for use in real applications. These results indicate that machine learning algorithms would be suitable modeling tools for determining the geographical origin of shiitake.
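The two metrics used above to rank the models, accuracy and kappa, can be illustrated with a minimal sketch. It assumes that kappa refers to Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The function name `accuracy_and_kappa` and the label lists are hypothetical and chosen only for illustration; they are not data or code from this study.

```python
from collections import Counter


def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label lists."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(y_true)

    # Observed agreement: fraction of samples classified correctly (accuracy).
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement: for each class, the product of the true and
    # predicted class frequencies, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if expected == 1.0:
        # Only one class occurs in both lists; kappa is undefined.
        return observed, float("nan")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical labels, NOT data from the study: 8 samples, one misclassified.
y_true = ["Korean"] * 4 + ["Chinese"] * 4
y_pred = ["Korean"] * 3 + ["Chinese"] * 5

acc, kappa = accuracy_and_kappa(y_true, y_pred)
print(f"accuracy = {acc:.3f}, kappa = {kappa:.3f}")
```

Because kappa discounts chance agreement, it is lower than accuracy whenever some correct predictions could be expected by chance alone, which is why a model can reach an accuracy of 0.952 with a kappa of only 0.869.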