Affiliation:
1. State Key Laboratory of Chemical Engineering, School of Chemical Engineering, East China University of Science and Technology, Shanghai, China
2. Process Systems Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany
3. Engineering Research Center of Resource Utilization of Carbon‐containing Waste with Carbon Neutrality (Ministry of Education), East China University of Science and Technology, Shanghai, China
4. Process Systems Engineering, Otto‐von‐Guericke University Magdeburg, Magdeburg, Germany
Abstract
This work introduces a scalable and integrated machine learning (ML) framework that streamlines the key steps of building quantitative structure–property relationship (QSPR) models for molecular property prediction. Specifically, molecular descriptor generation, feature engineering, ML model training, model selection and ensembling, and model validation and timing are integrated into a single workflow within the proposed framework. Unlike existing modeling methods, which rely on human experts and focus primarily on model/hyperparameter selection, the proposed framework succeeds by ensembling multiple models and stacking them in multiple layers. The efficiency and effectiveness of the proposed framework are demonstrated through comparisons with literature‐reported QSPR models on identical datasets in three property modeling case studies: the flash point temperature, the melting temperature, and the octanol–water partition coefficient. While requiring much less modeling time, the models produced by the proposed framework show better predictive performance than the reference models in all three case studies.
Funder
National Natural Science Foundation of China
Subject
General Chemical Engineering, Environmental Engineering, Biotechnology
Cited by
2 articles.