Abstract
Artificial-intelligence-based chemistry models are a promising method of exploring chemical reaction design spaces. However, training datasets based on experimental synthesis typically report only the optimal synthesis reactions, which leads to an inherited bias in the model predictions. Robust datasets that span the entire solution space are therefore necessary to remove this bias and permit complete training over the space. In this study, an artificial intelligence model based on a Variational AutoEncoder (VAE) has been developed and investigated to synthetically generate continuous datasets. The approach involves sampling the latent space to generate new chemical reactions. The technique is demonstrated by generating over 7,000,000 new reactions from a training dataset containing only 7,000 reactions. The generated reactions include molecular species that are larger and more diverse than those in the training set.
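The latent-space sampling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the decoder here is an untrained random linear map standing in for a trained VAE decoder, and `LATENT_DIM`, `VOCAB`, `SEQ_LEN`, and all function names are hypothetical choices made for illustration. It only shows the general pattern of drawing latent vectors from a standard normal prior and decoding each one into a token sequence.

```python
import random

LATENT_DIM = 4                      # hypothetical latent size
VOCAB = ["C", "O", "N", "=", ">>"]  # toy token set, not the paper's reaction encoding
SEQ_LEN = 6                         # hypothetical fixed output length


def make_toy_decoder(seed=0):
    """Build a fixed random linear map that stands in for a trained decoder."""
    rng = random.Random(seed)
    # Layout: weights[position][token] is a vector of length LATENT_DIM.
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)] for _ in VOCAB]
        for _ in range(SEQ_LEN)
    ]


def sample_latent(rng):
    """Draw one latent vector z from the standard normal prior N(0, I)."""
    return [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]


def decode(weights, z):
    """Greedy decode: at each position, pick the token with the highest score."""
    tokens = []
    for position_weights in weights:
        scores = [sum(w * zi for w, zi in zip(row, z)) for row in position_weights]
        tokens.append(VOCAB[scores.index(max(scores))])
    return tokens


def generate(n, seed=1):
    """Sample n latent vectors and decode each into a token sequence."""
    rng = random.Random(seed)
    weights = make_toy_decoder()
    return [decode(weights, sample_latent(rng)) for _ in range(n)]


samples = generate(5)
for tokens in samples:
    print("".join(tokens))
```

Because each call to `sample_latent` yields a new point in the latent space, the number of generated sequences is limited only by how many samples are drawn, which is the general mechanism that lets a generative model produce far more outputs than it had training examples.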
Funder
National Science Foundation
Publisher
Springer Science and Business Media LLC
Subject
Materials Chemistry, Biochemistry, Environmental Chemistry, General Chemistry
Cited by
13 articles.