A Novel Variable Selection Method Based on Binning-Normalized Mutual Information for Multivariate Calibration

Author:

Zhong Liang (1), Huang Ruiqi (1), Gao Lele (1), Yue Jianan (1), Zhao Bing (1), Nie Lei (1), Li Lian (1), Wu Aoli (1), Zhang Kefan (1), Meng Zhaoqing (2), Cao Guiyun (2), Zhang Hui (1, 3), Zang Hengchang (1, 3, 4)

Affiliation:

1. NMPA Key Laboratory for Technology Research and Evaluation of Drug Products, School of Pharmaceutical Sciences, Cheeloo College of Medicine, Shandong University, Jinan 250012, China

2. Shandong Hongjitang Pharmaceutical Group Co. Ltd., Jinan 250103, China

3. National Glycoengineering Research Center, Shandong University, Jinan 250012, China

4. Key Laboratory of Chemical Biology, Ministry of Education, Shandong University, Jinan 250012, China

Abstract

Variable (wavelength) selection is essential in the multivariate analysis of near-infrared spectra because it improves model performance and makes the model easier to interpret. This paper proposes a new variable selection method, binning-normalized mutual information (B-NMI), based on information entropy theory. “Data binning” is applied to reduce the effects of minor measurement errors and to enhance the features of near-infrared spectra. “Normalized mutual information” is then used to calculate the correlation between each wavelength and the reference values. The performance of B-NMI was evaluated on two experimental datasets (an ideal ternary solvent mixture dataset and a fluidized bed granulation dataset) and two public datasets (a gasoline octane dataset and a corn protein dataset). Compared with the classic methods of backward interval PLS (BIPLS), variable importance in projection (VIP), correlation coefficient (CC), uninformative variable elimination (UVE), and competitive adaptive reweighted sampling (CARS), B-NMI not only selected the most characteristic wavelengths from the spectra of complex real-world samples but also improved the stability and robustness of the variable selection results.
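
The two steps named in the abstract, data binning followed by normalized mutual information between each wavelength and the reference values, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the binning scheme, the number of bins, or the normalization, so equal-width bins, `n_bins=10`, and normalization by the geometric mean of the two entropies are all assumptions made here, and the function names (`equal_width_bins`, `normalized_mutual_information`, `rank_wavelengths`) are hypothetical.

```python
import numpy as np


def entropy(labels):
    """Shannon entropy (in nats) of a 1-D array of discrete labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))


def equal_width_bins(values, n_bins):
    """Map continuous values to integer bin indices 0..n_bins-1.

    Assumption: equal-width bins over the observed range; the paper may
    use a different binning scheme.
    """
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # A constant variable falls into a single bin.
        return np.zeros(values.shape, dtype=int)
    edges = np.linspace(lo, hi, n_bins + 1)
    # Use interior edges only, so the maximum lands in the last bin.
    return np.digitize(values, edges[1:-1])


def normalized_mutual_information(x, y, n_bins=10):
    """NMI between two binned variables, I(X;Y) / sqrt(H(X) * H(Y)).

    Assumption: geometric-mean normalization; other normalizations exist.
    """
    bx = equal_width_bins(x, n_bins)
    by = equal_width_bins(y, n_bins)
    hx, hy = entropy(bx), entropy(by)
    if hx == 0.0 or hy == 0.0:
        # A constant variable carries no information about the other.
        return 0.0
    # Encode each (bx, by) pair as one label to get the joint entropy.
    hxy = entropy(bx * n_bins + by)
    mutual_info = hx + hy - hxy
    return float(mutual_info / np.sqrt(hx * hy))


def rank_wavelengths(spectra, reference, n_bins=10):
    """Score every column (wavelength) of `spectra` against `reference`.

    Returns the NMI scores and the column indices sorted from the highest
    score to the lowest.
    """
    spectra = np.asarray(spectra, dtype=float)
    scores = np.array([
        normalized_mutual_information(spectra[:, j], reference, n_bins)
        for j in range(spectra.shape[1])
    ])
    order = np.argsort(scores)[::-1]
    return scores, order
```

In this sketch, wavelengths would be kept by taking the leading entries of `order` or by thresholding `scores`; how the published method chooses the final subset is not described in the abstract.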

Funder

Key R&D Program of Shandong Province

National Key Research and Development Program of China

Major industrial research project for the transformation of new and old kinetic energy of Shandong Province

Shandong Province Natural Science Foundation

Major Scientific and Technological Innovation Project of Shandong Province

Publisher

MDPI AG

Subject

Chemistry (miscellaneous), Analytical Chemistry, Organic Chemistry, Physical and Theoretical Chemistry, Molecular Medicine, Drug Discovery, Pharmaceutical Science
