An ensemble machine learning model generates a focused screening library for the identification of CDK8 inhibitors

Author:

Lin Tony Eight (1, 2), Yen Dyan (1), HuangFu Wei‐Chun (1, 2, 3), Wu Yi‐Wen (1), Hsu Jui‐Yi (1, 2), Yen Shih‐Chung (4), Sung Tzu‐Ying (5), Hsieh Jui‐Hua (6), Pan Shiow‐Lin (1, 2, 3), Yang Chia‐Ron (7), Huang Wei‐Jan (8), Hsu Kai‐Cheng (1, 2, 3, 9)

Affiliation:

1. Graduate Institute of Cancer Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan

2. Ph.D. Program for Cancer Molecular Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan

3. TMU Research Center of Cancer Translational Medicine, Taipei Medical University, Taipei, Taiwan

4. Warshel Institute for Computational Biology, The Chinese University of Hong Kong (Shenzhen), Shenzhen, Guangdong, People's Republic of China

5. Biomedical Translation Research Center, Academia Sinica, Taipei, Taiwan

6. Division of Translational Toxicology, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina, USA

7. School of Pharmacy, College of Medicine, National Taiwan University, Taipei, Taiwan

8. Graduate Institute of Pharmacognosy, College of Pharmacy, Taipei Medical University, Taipei, Taiwan

9. Cancer Center, Wan Fang Hospital, Taipei Medical University, Taipei, Taiwan

Abstract

The identification of an effective inhibitor is an important first step in drug development. Unfortunately, many issues, such as the characterization of protein binding sites, the choice of screening library, and the materials required for assays, make drug screening difficult. As screening libraries grow, more resources are consumed inefficiently. Thus, new strategies are needed to preprocess a screening library and focus it on a target protein. Herein, we report an ensemble machine learning (ML) model that generates a CDK8‐focused screening library. The ensemble model consists of six different algorithms optimized for CDK8 inhibitor classification. The models were trained using a CDK8‐specific fragment library along with molecules with CDK8 activity. The optimized ensemble model processed a commercial library containing 1.6 million molecules. This resulted in a CDK8‐focused screening library containing 1,672 molecules, a reduction of more than 99.90%. The CDK8‐focused library was then subjected to molecular docking, and 25 candidate compounds were selected. Enzymatic assays confirmed six CDK8 inhibitors, with one compound producing an IC50 value of ≤100 nM. Analysis of the ensemble ML model reveals the role of the CDK8 fragment library during training. Structural analysis reveals the hit compounds to be structurally novel CDK8 inhibitors. Together, the results highlight a pipeline for curating a focused library for a specific protein target, such as CDK8.

Funder

National Science and Technology Council

Publisher

Wiley
