Causal Artificial Intelligence Models of Food Quality Data

Author:

Kurtanjek Želimir¹

Affiliation:

1. University of Zagreb Faculty of Food Technology and Biotechnology, Pierrotijeva 6, 10000 Zagreb, Croatia

Abstract

Research background. The aim of this study is to emphasize the importance of artificial intelligence (AI) and causality modelling for food quality analysis with 'big data'. AI with structural causal modelling (SCM), based on Bayesian networks and deep learning, enables the integration of theoretical field knowledge in food technology with process production, physicochemical analytics and consumer organoleptic assessments. Food products are complex in nature and their data are high-dimensional, with intricate interrelations (correlations) that are difficult to relate to consumer sensory perception of food quality. Standard regression modelling techniques such as multiple ordinary least squares (OLS) and partial least squares (PLS) are effective for prediction by linear interpolation of observed data under cross-sectional stationary conditions. Upgrading linear regression models with machine learning (ML) accounts for nonlinear relations and reveals functional patterns, but remains prone to confounding and to failed predictions under unobserved nonstationary conditions. Confounding of data variables is the main obstacle to applying regression models to food innovations under conditions not covered by the training data. Hence, this manuscript focuses on applying causal graphical models with Bayesian networks to infer causal relationships and intervention effects between process variables and consumer sensory assessment of food quality.

Experimental approach. This study is based on data available in the literature on wheat bread baking quality, consumer sensory quality assessments of fermented milk products, and professional wine tasting. The data for wheat baking quality were regularized by the least absolute shrinkage and selection operator (LASSO elastic net). Bayesian statistics was applied to evaluate the joint probability function of the model in order to infer the network structure and parameters. The obtained SCMs are presented as directed acyclic graphs (DAGs). D-separation criteria were applied to block confounding effects when estimating the direct and total causal effects of process variables and consumer perception on food quality. Probability distributions of the causal effects of interventions on individual process variables are presented as partial dependency plots determined by Bayesian neural networks. In the case of wine quality, the total causal effects determined by the SCMs are validated by the double machine learning (DML) algorithm.

Results and conclusions. A data set of 45 continuous variables, corresponding to different chemical, physical and biochemical properties of wheat from seven Croatian cultivars during two years of controlled cultivation, was analysed. LASSO regularization of the data set yielded ten key predictors, accounting for 98 % of the variance of the baking quality data. Based on the key variables, a predictive random forest model of quality with 75 % cross-validation accuracy was derived. Causal analysis between the quality and the key predictors was based on the Bayesian model represented as a DAG. Protein content shows the most important direct causal effect, with a path coefficient of 0.71. THMM (total high-molecular-mass glutenin subunits) content was an indirect cause with a path coefficient of 0.42, and the total average causal effect (ACE) of protein was 0.65. The large data set on the quality of fermented milk products included binary consumer sensory data (taste, odour, turbidity), continuous physical variables (temperature, fat, pH, colour) and three grade classes of products by consumer quality assessment. A random forest model was derived for the prediction of the quality classification, with an out-of-bag (OOB) error of 0.28 %. The Bayesian network model predicts that the direct causes of the taste classification are temperature, colour and fat content, while the direct causes of the quality classification are temperature, turbidity, odour and fat content. The key ACEs on quality grade were estimated as –0.04 grade/°C for temperature and 0.3 grade per unit of fat content. The ACE of temperature is nonlinear, showing negative saturation with a 'breaking' point at 60 °C, while the ACE of fat content follows a positive linear trend. Causal quality analysis of red and white wine was based on a large data set of eleven continuous variables of physical and chemical properties and quality assessments classified in ten classes, from 1 to 10. Each classification was obtained in triplicate by a panel of professional wine tasters. A non-structural double machine learning (DML) algorithm was applied to assess the total ACE on quality. The alcohol content of red and white wine had the key positive relative ACE of 0.35 quality/alcohol, while volatile acidity had the key negative ACE of –0.2 quality/acidity. The ACE predictions obtained by the unstructured DML algorithm are in close agreement with the ACE obtained by the structural SCM.

Novelty and scientific contribution. Novel methodologies and results for the application of causal artificial intelligence models in the analysis of consumer assessment of the quality of food products are presented. The application of Bayesian network structural causal models (SCM) allows the pronounced confounding effects between parameters, which distort noncausal regression models, to be blocked by d-separation. Based on the SCM, inference of ACE provides substantiated and validated research hypotheses for new products and supports decisions on potential interventions for improvement in product design, new process introduction, process control, management and marketing.
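The distinction drawn in the abstract between a noncausal regression estimate and an average causal effect (ACE) can be illustrated with a minimal Python sketch. It is not the author's implementation and uses no data from the study. It assumes a hypothetical linear SCM with one confounder Z (Z → X, Z → Y, X → Y), synthetic Gaussian data, and arbitrary coefficients. The names `simulate_scm`, `naive_effect`, `backdoor_effect` and `partialling_out_effect` are invented for illustration. The last function shows only the residual-on-residual step on which DML is built, with linear nuisance models and without the cross-fitting and flexible learners that a full DML estimator would use.

```python
import numpy as np


def simulate_scm(n, effect, seed=0):
    """Draw n samples from a toy linear SCM: Z -> X, Z -> Y, X -> Y.

    All coefficients are arbitrary illustration values, not study results.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                          # confounder
    x = 0.8 * z + rng.normal(size=n)                # treatment depends on Z
    y = effect * x + 1.5 * z + rng.normal(size=n)   # outcome depends on X and Z
    return z, x, y


def design_matrix(features, n):
    """Stack an intercept column with the given feature columns."""
    return np.column_stack([np.ones(n)] + list(features))


def ols_coefs(features, target):
    """Least-squares coefficients: [intercept, slope_1, slope_2, ...]."""
    design = design_matrix(features, len(target))
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coefs


def residual(features, target):
    """Part of target that a linear fit on the features does not explain."""
    design = design_matrix(features, len(target))
    return target - design @ ols_coefs(features, target)


def naive_effect(x, y):
    """Regress Y on X only; the open path X <- Z -> Y biases the slope."""
    return ols_coefs([x], y)[1]


def backdoor_effect(z, x, y):
    """Regress Y on X and Z; conditioning on Z blocks the backdoor path."""
    return ols_coefs([x, z], y)[1]


def partialling_out_effect(z, x, y):
    """Residual-on-residual slope (the core step of DML, no cross-fitting)."""
    rx = residual([z], x)
    ry = residual([z], y)
    return float(rx @ ry / (rx @ rx))


if __name__ == "__main__":
    z, x, y = simulate_scm(n=20_000, effect=0.5)
    print("naive slope:       ", round(naive_effect(x, y), 3))
    print("backdoor-adjusted: ", round(backdoor_effect(z, x, y), 3))
    print("partialling-out:   ", round(partialling_out_effect(z, x, y), 3))
```

In this toy setting the unadjusted slope overstates the effect because the confounder raises both treatment and outcome, while the two adjusted estimates agree with each other and lie close to the simulated effect. With linear nuisance models the agreement between the adjusted estimates is exact up to rounding, which follows from the Frisch-Waugh-Lovell theorem.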

Publisher

Faculty of Food Technology and Biotechnology - University of Zagreb
