Affiliation:
1. Tianjin Key Laboratory of Molecular Design and Drug Discovery, Tianjin Institute of Pharmaceutical Research,
306 Huiren Road, Tianjin, 300301, P.R. China
Abstract
Aims:
Machine learning-based QSAR modeling, molecular generation,
and molecular dynamics simulations were applied to the virtual screening of DNA polymerase
theta inhibitors.
Background:
DNA polymerase theta (Polθ or POLQ) is an attractive target for the treatment
of homologous recombination-deficient (such as BRCA-deficient) cancers. There are no approved
drugs targeting POLQ, and only one inhibitor is in Phase II clinical trials; thus, it is
necessary to develop novel POLQ inhibitors.
Objectives:
To build machine learning models that predict the bioactivities of POLQ inhibitors.
To build molecular generation models that generate diverse molecules. To virtually screen the
generated molecules with the machine learning models. To analyze the binding modes of the
screening hits by molecular dynamics simulations.
Methods:
In the present work, 325 inhibitors with POLQ polymerase domain bioactivities were
collected. Two machine learning methods, random forest and deep neural network, were used
to build ligand- and structure-based quantitative structure-activity relationship (QSAR)
models. A substructure replacement-based method and a transfer learning-based deep recurrent
neural network method were used for molecular generation. Molecular docking and consensus
QSAR models were used for virtual screening. Molecular dynamics simulations and
MM/GBSA binding free energy calculation and decomposition were used to further analyze the
screening results.
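One common way to form a consensus from several classifiers can be sketched as follows. This is an illustrative sketch only: the abstract does not state how the consensus QSAR models combine their members, so averaging the predicted probabilities and applying a 0.5 threshold is an assumption, and the function name `consensus_predict` and the example probabilities are hypothetical.

```python
def consensus_predict(prob_lists, threshold=0.5):
    """Average per-model probabilities of "highly active" for each molecule.

    prob_lists: one list of probabilities per model, all of the same length.
    Returns (mean_probabilities, labels), where label 1 means the mean
    probability is at or above the threshold.
    NOTE: mean-probability voting is an assumed scheme, not the paper's.
    """
    if not prob_lists:
        raise ValueError("at least one model is required")
    n_molecules = len(prob_lists[0])
    if any(len(probs) != n_molecules for probs in prob_lists):
        raise ValueError("all models must score the same molecules")

    n_models = len(prob_lists)
    # zip(*prob_lists) groups the scores of each molecule across models.
    means = [sum(scores) / n_models for scores in zip(*prob_lists)]
    labels = [1 if mean >= threshold else 0 for mean in means]
    return means, labels


# Hypothetical probabilities from a random forest and a deep neural network
# for three molecules.
rf_probs = [0.9, 0.4, 0.2]
dnn_probs = [0.7, 0.7, 0.1]
mean_probs, consensus_labels = consensus_predict([rf_probs, dnn_probs])
```

Under this scheme, a molecule on which the two models disagree is kept only if the more confident model outweighs the other.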
Results:
The MCC values of the best ligand- and structure-based consensus QSAR models
reached 0.651 and 0.361 for the test set, respectively. The machine learning-based docking
scores had better predictive ability to distinguish highly active from weakly active poses than the
original docking scores. A total of 96,490 molecules were generated by the two molecular generation
methods, and 10 molecules were retained after virtual screening. Four favorable interactions were
identified by molecular dynamics simulations.
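For readers unfamiliar with the metric reported above, the Matthews correlation coefficient (MCC) for a binary classifier is computed from the confusion matrix as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below is a minimal pure-Python version; the label encoding (1 for highly active, 0 for weakly active) and the toy labels are assumptions for illustration and are not data from this study.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # By convention, MCC is taken as 0 when a row or column of the
        # confusion matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Toy example (hypothetical labels): 1 = highly active, 0 = weakly active.
toy_true = [1, 1, 1, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 1]
toy_mcc = matthews_corrcoef(toy_true, toy_pred)
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it stays informative when the two activity classes are imbalanced.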
Conclusion:
We hope that the screening results and the binding modes will be helpful for designing
highly active POLQ polymerase inhibitors, and that the models of the molecular design workflow
can serve as reliable tools for drug design.
Publisher
Bentham Science Publishers Ltd.