OCS-TGBM: Intelligent Analysis of Organic Chemical Synthesis Based on Topological Data Analysis and LightGBM
-
Published:2023-12
Issue:3
Volume:91
Page:557-592
-
ISSN:0340-6253
-
Container-title:MATCH – Communications in Mathematical and in Computer Chemistry
-
language:
-
Short-container-title:MATCH
Author:
Guo Yanhui, Peng Lichao, Li Zixin, Yu Mengen, Jiao Xue, Chai Yun, Yang Xiaohui
Abstract
Organic synthesis is widely used in drug discovery and development. The intelligent prediction and analysis of yields in high-throughput coupling reactions is an important and challenging research topic in the field. However, existing methods focus on predicting yield rather than on studying and interpreting the internal relationship between reaction conditions and yield. To address this problem, an intelligent analysis model for organic chemical synthesis, named OCS-TGBM, is proposed. It combines topological data analysis (TDA) with the Light Gradient Boosting Machine (LightGBM) to explore the internal relationship between reaction conditions and yield and to identify high-yield reaction conditions and combinations of conditions. To further improve the performance of the OCS-TGBM model, a stratified diversity sampling strategy is introduced. Experimental results show that the OCS-TGBM model is superior to other methods in analyzing and predicting the reaction performance of high-throughput organic chemical synthesis. It also provides intelligent assistance for the optimal design of reaction systems and the evaluation of reaction conditions, thus accelerating drug discovery and development.
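The abstract does not describe how the stratified diversity sampling strategy works, so the following is only a minimal sketch of plain stratified sampling over yield, not the authors' procedure. It assumes yields are percentages in [0, 100], splits them into equal-width bins, and draws the same number of reactions from each bin so that low-yield and high-yield conditions are both represented. The function name `stratified_sample`, the record fields, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(records, n_bins=4, per_bin=2, seed=0):
    """Draw up to `per_bin` records from each equal-width yield bin."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        # Assumption: yield is a percentage in [0, 100]; 100 falls in the last bin.
        idx = min(int(rec["yield"] / 100.0 * n_bins), n_bins - 1)
        strata[idx].append(rec)
    sample = []
    for idx in sorted(strata):
        group = strata[idx]
        # Take everything if a bin holds fewer than `per_bin` records.
        sample.extend(rng.sample(group, min(per_bin, len(group))))
    return sample


# Hypothetical toy data: reaction conditions paired with a measured yield (%).
reactions = [
    {"ligand": f"L{i % 3}", "base": f"B{i % 2}", "yield": float(y)}
    for i, y in enumerate([3, 8, 15, 22, 31, 38, 44, 52, 61, 67, 78, 85, 91, 97])
]

subset = stratified_sample(reactions, n_bins=4, per_bin=2)
for rec in subset:
    print(rec)
```

A sample balanced in this way could then be passed to a gradient-boosting regressor; the diversity component and the TDA step named in the abstract are not modeled here.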
Publisher
University Library in Kragujevac
Subject
Applied Mathematics, Computational Theory and Mathematics, Computer Science Applications, General Chemistry