QSAR-derived affinity fingerprints (part 1): fingerprint construction and modeling performance for similarity searching, bioactivity classification and scaffold hopping

Author:

Škuta C.ORCID,Cortés-Ciriano I.ORCID,Dehaen W.ORCID,Kříž P.ORCID,van Westen G. J. P.ORCID,Tetko I. V.ORCID,Bender A.ORCID,Svozil D.ORCID

Abstract

AbstractAn affinity fingerprint is the vector consisting of compound’s affinity or potency against the reference panel of protein targets. Here, we present the QAFFP fingerprint, 440 elements long in silico QSAR-based affinity fingerprint, components of which are predicted by Random Forest regression models trained on bioactivity data from the ChEMBL database. Both real-valued (rv-QAFFP) and binary (b-QAFFP) versions of the QAFFP fingerprint were implemented and their performance in similarity searching, biological activity classification and scaffold hopping was assessed and compared to that of the 1024 bits long Morgan2 fingerprint (the RDKit implementation of the ECFP4 fingerprint). In both similarity searching and biological activity classification, the QAFFP fingerprint yields retrieval rates, measured by AUC (~ 0.65 and ~ 0.70 for similarity searching depending on data sets, and ~ 0.85 for classification) and EF5 (~ 4.67 and ~ 5.82 for similarity searching depending on data sets, and ~ 2.10 for classification), comparable to that of the Morgan2 fingerprint (similarity searching AUC of ~ 0.57 and ~ 0.66, and EF5 of ~ 4.09 and ~ 6.41, depending on data sets, classification AUC of ~ 0.87, and EF5 of ~ 2.16). However, the QAFFP fingerprint outperforms the Morgan2 fingerprint in scaffold hopping as it is able to retrieve 1146 out of existing 1749 scaffolds, while the Morgan2 fingerprint reveals only 864 scaffolds.

Funder

Ministry of Education, Youth and Sports of the Czech Republic

H2020 Marie Skłodowska-Curie Actions

FP7 People: Marie-Curie Actions

Publisher

Springer Science and Business Media LLC

Subject

Library and Information Sciences,Computer Graphics and Computer-Aided Design,Physical and Theoretical Chemistry,Computer Science Applications

Reference116 articles.

1. Tanrikulu Y, Kruger B, Proschak E (2013) The holistic integration of virtual screening in drug discovery. Drug Discov Today 18(7–8):358–364

2. Rogers D, Hahn M (2010) Extended-connectivity fingerprints. J Chem Inf Model 50(5):742–754

3. Consonni V, Todeschini R (2000) Handbook of molecular descriptors. Wiley-VCH, New York

4. Wassermann AM, Lounkine E, Davies JW, Glick M, Camargo LM (2015) The opportunities of mining historical and collective data in drug discovery. Drug Discov Today 20(4):422–434

5. Paricharak S, Mendez-Lucio O, Chavan Ravindranath A, Bender A, Ijzerman AP, van Westen GJ (2016) Data-driven approaches used for compound library design, hit triage and bioactivity modeling in high-throughput screening. Brief Bioinform 19(2):277–285

Cited by 26 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3