Datasets Construction and Development of QSAR Models for Predicting Micronucleus In Vitro and In Vivo Assay Outcomes

Authors:

Khondkaryan Lusine (1, 2), Tevosyan Ani (2, 3), Navasardyan Hayk (2), Khachatrian Hrant (3, 4), Tadevosyan Gohar (1, 2), Apresyan Lilit (1, 2), Chilingaryan Gayane (3), Navoyan Zaven (2), Stopper Helga (5), Babayan Nelly (1, 2)

Affiliations:

1. Institute of Molecular Biology, NAS RA, Yerevan 0014, Armenia

2. Toxometris.ai, Yerevan 0009, Armenia

3. YerevaNN, Yerevan 0025, Armenia

4. Department of Informatics and Applied Mathematics, Yerevan State University, Yerevan 0025, Armenia

5. Institute of Pharmacology and Toxicology, University of Würzburg, 97078 Würzburg, Germany

Abstract

In silico (quantitative) structure–activity relationship modeling provides a fast and cost-effective alternative for assessing the genotoxic potential of chemicals. However, one limiting factor for model development is the availability of consolidated experimental datasets. In the present study, we collected experimental data on micronuclei in vitro and in vivo from databases and a PubMed search, aided by text mining with the BioBERT large language model. Chemotype enrichment analysis was performed on the updated datasets to identify enriched substructures, and chemotypes common to both endpoints were identified. Five machine learning models, in combination with molecular descriptors, twelve fingerprints, and two data balancing techniques, were applied to construct individual models. The best-performing individual models were selected for ensemble construction. The curated final dataset consists of 981 chemicals for micronuclei in vitro and 1309 for mouse micronuclei in vivo. Of the 18 chemotypes enriched in micronuclei in vitro, only 7 were found to be relevant for in vivo prediction. The ensemble model exhibited high accuracy and sensitivity when applied to an external test set of in vitro data. Good, balanced predictive performance was also achieved for the micronucleus in vivo endpoint.
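
The abstract does not say which statistic was used for chemotype enrichment. As an illustration only, the sketch below assumes a common choice: for one substructure, a 2×2 table of chemicals (with or without the chemotype, positive or negative in the assay) is scored with an odds ratio and a one-sided Fisher's exact test. The function name `fisher_enrichment` and the example counts are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_enrichment(tp, fp, fn, tn):
    """Score one chemotype for enrichment among assay-positive chemicals.

    tp: positives containing the chemotype
    fp: negatives containing the chemotype
    fn: positives lacking the chemotype
    tn: negatives lacking the chemotype
    Returns (odds_ratio, one_sided_p_value).
    """
    total = tp + fp + fn + tn
    with_chemotype = tp + fp
    positives = tp + fn

    # Hypergeometric upper tail: probability of seeing at least `tp`
    # positives among the chemicals that carry the chemotype by chance.
    max_k = min(with_chemotype, positives)
    tail = sum(
        comb(positives, k) * comb(total - positives, with_chemotype - k)
        for k in range(tp, max_k + 1)
    )
    p_value = tail / comb(total, with_chemotype)

    # Odds ratio; infinite when no negative carries the chemotype
    # or no positive lacks it.
    odds_ratio = (tp * tn) / (fp * fn) if fp * fn else float("inf")
    return odds_ratio, p_value


if __name__ == "__main__":
    # Hypothetical counts for a single chemotype (illustrative only).
    odds, p = fisher_enrichment(tp=30, fp=5, fn=270, tn=695)
    print(f"odds ratio = {odds:.2f}, one-sided p = {p:.3g}")
```

Under this assumption, a chemotype would be flagged as enriched when its odds ratio and p-value pass chosen thresholds; the thresholds themselves would have to come from the full paper.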

Funder

RA MES (Republic of Armenia, Ministry of Education and Science) State Committee of Science

Publisher

MDPI AG

Subject

Chemical Health and Safety; Health, Toxicology and Mutagenesis; Toxicology
