Multi-armed bandit algorithm for sequential experiments of molecular properties with dynamic feature selection

Author:

Abedin Md. Menhazul (1, 2), Tabata Koji (3, 4, 5), Matsumura Yoshihiro (5), Komatsuzaki Tamiki (1, 3, 5, 6, 7)

Affiliation:

1. Graduate School of Chemical Sciences and Engineering, Hokkaido University, Sapporo 060-8628, Japan

2. Khulna University, Khulna 9208, Bangladesh

3. Research Institute for Electronic Science, Hokkaido University, Sapporo 001-0020, Japan

4. Department of Mathematics, Hokkaido University, Sapporo 060-0810, Japan

5. Institute for Chemical Reaction Design and Discovery (ICReDD), Hokkaido University, Sapporo 001-0020, Japan

6. Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Yamadaoka, Suita 565-0871, Osaka, Japan

7. The Institute of Scientific and Industrial Research, Osaka University, 8-1 Mihogaoka, Ibaraki 567-0047, Osaka, Japan

Abstract

Sequential optimization is a promising approach for identifying the optimal candidate(s) (molecules, reactants, drugs, etc.) with desired properties (reaction yield, selectivity, efficacy, etc.) from a large set of potential candidates while minimizing the number of experiments required. However, the high dimensionality of the feature space (e.g., molecular descriptors) often makes it difficult to exploit the relevant features when updating the set of candidates to be examined. In this article, we develop a new sequential optimization algorithm for molecular problems based on reinforcement learning, a multi-armed linear bandit framework, and online, dynamic feature selection, in which the relevant molecular descriptors are updated as the experiments proceed. We also design a stopping condition intended to guarantee the reliability of the candidate chosen from the dataset pool. The algorithm is evaluated against Bayesian optimization (BO) on two synthetic datasets and two real datasets: one containing hydration free energies of molecules and the other containing free energy differences between enantiomeric products of a chemical reaction. We find that, as an overall trend, dynamically selecting the features that represent the desired property during the experiments improves performance (e.g., the time required to find the best candidate and stop the experiment), and that our multi-armed linear bandit approach with dynamic feature selection outperforms standard BO with fixed feature variables. We also compare our algorithm with BO combined with dynamic feature selection.
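The loop described in the abstract (test a candidate, re-select descriptors, refit a linear model, decide whether to stop) can be illustrated with a minimal sketch. This is not the authors' algorithm: the correlation-based feature ranking, the ridge-regression upper confidence bound, the stopping rule, and every name and parameter below (select_features, run_bandit, k, alpha, lam, n_init) are assumptions chosen only for illustration.

```python
import numpy as np


def select_features(X_obs, y_obs, k):
    """Assumed selection rule: keep the k descriptors with the largest
    absolute correlation with the properties measured so far."""
    Xc = X_obs - X_obs.mean(axis=0)
    yc = y_obs - y_obs.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12
    score = np.abs(Xc.T @ yc) / denom
    return np.sort(np.argsort(score)[-k:])


def run_bandit(X, y, k=3, n_init=5, alpha=1.0, lam=1.0, seed=0):
    """Sequentially 'measure' candidates (rows of X) until a heuristic
    stopping rule fires. y plays the role of the experiment outcome."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Start from a few randomly chosen experiments.
    tested = [int(i) for i in rng.choice(n, size=n_init, replace=False)]
    while len(tested) < n:
        idx = np.array(tested)
        # Re-select the descriptors after every new measurement.
        feats = select_features(X[idx], y[idx], k)
        A = X[idx][:, feats]
        # Ridge regression on the selected descriptors.
        V = A.T @ A + lam * np.eye(k)
        theta = np.linalg.solve(V, A.T @ y[idx])
        untested = np.setdiff1d(np.arange(n), idx)
        Z = X[untested][:, feats]
        # Upper confidence bound: prediction plus alpha times the
        # uncertainty sqrt(z^T V^{-1} z) of each untested candidate.
        Vinv = np.linalg.inv(V)
        quad = np.einsum("ij,jk,ik->i", Z, Vinv, Z)
        ucb = Z @ theta + alpha * np.sqrt(np.maximum(quad, 0.0))
        # Assumed stopping rule: stop once no untested candidate has an
        # upper bound above the best value already measured.
        if ucb.max() <= y[idx].max():
            break
        tested.append(int(untested[np.argmax(ucb)]))
    best = tested[int(np.argmax(y[np.array(tested)]))]
    return best, tested
```

Calling run_bandit(X, y) on a descriptor matrix X (candidates by descriptors) and a property vector y returns the index of the best measured candidate and the order in which candidates were tested. A stopping rule of this heuristic form carries no guarantee that the global optimum was found; the stopping condition in the article is designed to address the reliability of the chosen candidate.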

Funder

Japan Science and Technology Agency

Japan Society for the Promotion of Science

Japan Agency for Medical Research and Development

Publisher

AIP Publishing
