Synchronously Predicting Tea Polyphenol and Epigallocatechin Gallate in Tea Leaves Using Fourier Transform–Near-Infrared Spectroscopy and Machine Learning

Author:

Ye Sitan1,Weng Haiyong23,Xiang Lirong4,Jia Liangquan5,Xu Jinchai23

Affiliation:

1. School of Engineering, Newcastle University, Newcastle upon Tyne NE1 7RU, UK

2. Fujian Key Laboratory of Agricultural Information Sensoring Technology, College of Mechanical and Electrical Engineering, Fujian Agriculture and Forestry University, Fuzhou 350002, China

3. School of Future Technology, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China

4. Department of Biological and Agricultural Engineering, North Carolina State University, Raleigh, NC 27606, USA

5. School of Information Engineering, Huzhou University, Huzhou 313000, China

Abstract

Tea polyphenol and epigallocatechin gallate (EGCG) were considered as key components of tea. The rapid prediction of these two components can be beneficial for tea quality control and product development for tea producers, breeders and consumers. This study aimed to develop reliable models for tea polyphenols and EGCG content prediction during the breeding process using Fourier Transform–near infrared (FT-NIR) spectroscopy combined with machine learning algorithms. Various spectral preprocessing methods including Savitzky–Golay smoothing (SG), standard normal variate (SNV), vector normalization (VN), multiplicative scatter correction (MSC) and first derivative (FD) were applied to improve the quality of the collected spectra. Partial least squares regression (PLSR) and least squares support vector regression (LS-SVR) were introduced to establish models for tea polyphenol and EGCG content prediction based on different preprocessed spectral data. Variable selection algorithms, including competitive adaptive reweighted sampling (CARS) and random forest (RF), were further utilized to identify key spectral bands to improve the efficiency of the models. The results demonstrate that the optimal model for tea polyphenols calibration was the LS-SVR with Rp = 0.975 and RPD = 4.540 based on SG-smoothed full spectra. For EGCG detection, the best model was the LS-SVR with Rp = 0.936 and RPD = 2.841 using full original spectra as model inputs. The application of variable selection algorithms further improved the predictive performance of the models. The LS-SVR model for tea polyphenols prediction with Rp = 0.978 and RPD = 4.833 used 30 CARS-selected variables, while the LS-SVR model build on 27 RF-selected variables achieved the best predictive ability with Rp = 0.944 and RPD = 3.049, respectively, for EGCG prediction. The results demonstrate a potential of FT-NIR spectroscopy combined with machine learning for the rapid screening of genotypes with high tea polyphenol and EGCG content in tea leaves.

Funder

Integrate Interdisciplinary Disciplines to Promote the Development of Smart Agriculture

Science and Technology Innovation Special Foundation of Fujian Agriculture and Forestry University

Fujian Key Laboratory of Agricultural Information Sensoring Technology

Publisher

MDPI AG

Subject

Chemistry (miscellaneous),Analytical Chemistry,Organic Chemistry,Physical and Theoretical Chemistry,Molecular Medicine,Drug Discovery,Pharmaceutical Science

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3