Abstract
Meiotic recombination plays a crucial role in biological processes involving DNA double-strand breaks. Recombination hotspots are regions of 1 to 2 kb that are closely associated with these double-strand breaks. With the growth of both sperm data and population data, computational methods have been shown to help identify recombination spots while saving time and cost compared with experimental verification. To improve identification performance and to investigate the potential role of various DNA sequence-derived features in building computational models, we designed a computational model that extracts features including position-specific trinucleotide propensity (PSTNP) information, electron-ion interaction potential (EIIP) values, nucleotide composition (NC), and dinucleotide composition (DNC). Finally, a support vector machine (SVM) model was trained on the 172-dimensional features selected by F-score feature ranking, and the predictor reached an accuracy of 98.24% in the jackknife test, which suggests that this model is a promising approach for identifying recombination spots.
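As a rough illustration of two of the simpler feature types named above (NC and DNC) and of F-score ranking, a minimal Python sketch follows. The abstract does not give the exact formulas, so the F-score here is assumed to be the commonly used definition (between-class separation of the feature means divided by the sum of within-class sample variances). All function names are hypothetical, and the PSTNP and EIIP features, the SVM training, and the jackknife test are not shown.

```python
from itertools import product

BASES = "ACGT"
# The 16 possible dinucleotides, in a fixed order: AA, AC, ..., TT.
DINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=2)]


def nucleotide_composition(seq):
    """Return the frequencies of A, C, G and T in seq (4 values)."""
    seq = seq.upper()
    return [seq.count(b) / len(seq) for b in BASES]


def dinucleotide_composition(seq):
    """Return the frequencies of the 16 overlapping dinucleotides in seq."""
    seq = seq.upper()
    total = len(seq) - 1  # number of overlapping 2-mers
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing ambiguous bases such as N
            counts[pair] += 1
    return [counts[d] / total for d in DINUCLEOTIDES]


def f_score(pos, neg):
    """F-score of one feature from its values in the positive and negative sets.

    Assumed definition: squared deviations of the class means from the overall
    mean, divided by the sum of the within-class sample variances.
    """
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    mean_all = (sum(pos) + sum(neg)) / (len(pos) + len(neg))
    numerator = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    var_pos = sum((x - mean_pos) ** 2 for x in pos) / (len(pos) - 1)
    var_neg = sum((x - mean_neg) ** 2 for x in neg) / (len(neg) - 1)
    denominator = var_pos + var_neg
    if denominator == 0:
        # No within-class spread: perfectly separating if the means differ.
        return float("inf") if numerator > 0 else 0.0
    return numerator / denominator


def rank_features(pos_rows, neg_rows):
    """Return feature indices sorted from highest to lowest F-score."""
    n_features = len(pos_rows[0])
    scores = [
        f_score([row[j] for row in pos_rows], [row[j] for row in neg_rows])
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)
```

In a pipeline of this kind, each sequence would be turned into one feature vector (for example by concatenating the NC and DNC lists), and the top-ranked indices returned by `rank_features` would select the columns passed to the classifier.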
Subject
Computational Mathematics, Computer Science Applications, General Engineering
Cited by
1 article.