Abstract
This paper proposes a virtual screening method based on imbalanced data mining, which combines virtual screening techniques with imbalanced data classification methods to improve the traditional virtual screening process. First, k-means clustering and the SMOTE heuristic oversampling method are applied to deal with the imbalanced data that arise in the actual virtual screening process. Second, to enhance accuracy, a particle swarm optimization (PSO) algorithm is introduced to optimize the parameters of the support vector machine (SVM) classifier, and ensemble learning is brought in: a classifier combining PSO, SVM, and adaptive boosting (AdaBoost) screens the molecular docking conformations to improve prediction accuracy. Finally, the proposed method was validated experimentally using data from the Protein Data Bank (PDB) and PubChem databases. The experimental results indicate that the proposed method can effectively improve the accuracy of virtual screening and offers practical guidance for new drug development. By treating virtual screening as an imbalanced data classification problem, this research also provides a reference for addressing the problems faced by virtual screening technology.
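The oversampling idea mentioned in the abstract can be illustrated with a minimal sketch of SMOTE-style interpolation. This is not the authors' implementation: the function name `smote_oversample`, its parameters, and the toy coordinates are assumptions made for illustration, and the k-means stage and the PSO/SVM/AdaBoost classifier are not reproduced here.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration only; not the paper's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Assumed toy data: 3 minority ("active") samples against 9 majority samples.
minority = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.3)]
majority_count = 9
new_points = smote_oversample(minority, n_new=majority_count - len(minority))
balanced_minority = minority + new_points
print(len(balanced_minority), "minority samples after oversampling")
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class. A clustering step such as k-means could, under this sketch, be used to choose which minority samples are passed in.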
Funder
National Natural Science Foundation of China
the special projects for the central government to guide the development of local science and technology
Subject
Process Chemistry and Technology, Chemical Engineering (miscellaneous), Bioengineering
Cited by
5 articles.