Abstract
Background
A dataset of genes used to predict hepatitis C virus outcome was evaluated in a previous study using conventional statistical methodology.
Objective
The aim of this study was to reanalyze the same dataset using a data mining approach in order to find models that improve the classification accuracy of the genes studied.
Methods
We built predictive models from different subsets of factors, selected according to their importance in predicting patient classification. We then evaluated each model individually and in combination; the combined model gave better predictive performance.
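The workflow described above (rank factors by importance, fit a model per factor subset, then combine the models) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy genotype matrix `X`, the labels `y`, and the helper names `fit_stump`, `rank_features`, and `fit_ensemble` are hypothetical, and simple one-factor majority rules stand in for the partial decision trees used in the study. It is not the authors' implementation.

```python
from collections import Counter

# Hypothetical toy data, NOT from the study: each row is one patient,
# each column a genetic factor coded 0/1; y is the outcome class (0 or 1).
X = [
    [0, 0, 1], [0, 0, 0], [0, 1, 1], [1, 0, 0],
    [1, 1, 0], [1, 1, 1], [1, 1, 0], [0, 1, 1],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def fit_stump(X, y, j):
    """One-factor rule: map each observed value of factor j to its majority class."""
    labels_by_value = {}
    for row, label in zip(X, y):
        labels_by_value.setdefault(row[j], []).append(label)
    default = Counter(y).most_common(1)[0][0]  # fallback for unseen values
    table = {
        value: Counter(labels).most_common(1)[0][0]
        for value, labels in labels_by_value.items()
    }
    return lambda row: table.get(row[j], default)


def accuracy(model, X, y):
    """Fraction of rows for which the model predicts the true class."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def rank_features(X, y):
    """Order factors by the training accuracy of their one-factor rule
    (a crude importance proxy; ties go to the lower column index)."""
    scores = [(accuracy(fit_stump(X, y, j), X, y), j) for j in range(len(X[0]))]
    return [j for _, j in sorted(scores, key=lambda s: (-s[0], s[1]))]


def fit_ensemble(X, y, features):
    """Combine one rule per selected factor by majority vote."""
    stumps = [fit_stump(X, y, j) for j in features]

    def predict(row):
        votes = Counter(stump(row) for stump in stumps)
        return votes.most_common(1)[0][0]

    return predict


ranked = rank_features(X, y)            # factors, most informative first
ensemble = fit_ensemble(X, y, ranked)   # vote over all ranked factors
print("ranking:", ranked)
print("ensemble training accuracy:", accuracy(ensemble, X, y))
```

The accuracy printed here is measured on the same toy rows used for fitting, so it says nothing about generalization; a real analysis would estimate accuracy with cross-validation or a held-out set.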
Results
Our data mining approach identified genetic patterns that escaped detection using conventional statistics. More specifically, the partial decision trees and ensemble models increased the classification accuracy of hepatitis C virus outcome compared with conventional methods.
Conclusions
Data mining can be used more extensively in biomedicine, facilitating knowledge building and management of human diseases.