Abstract
Educational Data Mining (EDM) is a branch of data mining that focuses on extracting useful knowledge from data generated through academic activities at the school, college, or university level. The extracted knowledge can help improve how academic activities are carried out, so it is useful for students, parents, and the institutions themselves. One common activity in EDM is student grade prediction, which aims to identify weak or at-risk students. Early identification of such students makes it possible to take supportive measures that may help them improve. Among the vast number of approaches available in this field, this study focuses on generating a smarter dataset through a reduced feature set without reducing the number of records in it, and then on producing an approach that combines the strengths of classification and clustering for better prediction results. The study finds that individual features have distinct effects and that removing misclassified data can affect the overall results. Backward selection is adopted, using Pearson correlation as the metric, to produce a smarter dataset with fewer attributes and better prediction accuracy. After feature selection, we apply EMT (Ensemble Meta-Based Tree Model) classification to the dataset to identify the best-performing classifiers from five families of classifiers. In the hybrid approach, ensemble clustering is first applied to the smart dataset, and EMT classification is then applied to re-evaluate the un-clustered data, which boosts performance and yields an accuracy of 93%.
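The feature-reduction step described above can be illustrated with a minimal sketch. It assumes one plausible reading of backward selection with Pearson correlation: start from all features and repeatedly drop the one whose absolute correlation with the grade is weakest, until every remaining feature clears a cutoff. The function names, the `threshold` value, and the toy data are all hypothetical and are not taken from the study; the study's actual procedure may differ.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def backward_select(features, target, threshold=0.3):
    """Start from all features and drop the weakest one at a time.

    features: dict mapping a feature name to its column of values.
    threshold: hypothetical cutoff on |r|, not a value from the study.
    Only columns are removed; the number of records is unchanged.
    """
    selected = dict(features)
    while len(selected) > 1:
        scores = {name: abs(pearson(col, target)) for name, col in selected.items()}
        weakest = min(scores, key=scores.get)
        if scores[weakest] >= threshold:
            break  # every remaining feature is correlated strongly enough
        del selected[weakest]
    return sorted(selected)


# Toy data invented for illustration: five students, three features.
features = {
    "midterm": [40, 55, 60, 75, 90],
    "attendance": [50, 60, 65, 80, 95],
    "shoe_size": [42, 38, 44, 39, 41],
}
final_grade = [45, 50, 62, 78, 88]

kept = backward_select(features, final_grade)
print(kept)
```

In this toy example the weakly correlated `shoe_size` column is dropped while all five records are retained, which mirrors the stated goal of fewer attributes without fewer records.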
Subject
General Computer Science, Theoretical Computer Science
Cited by
2 articles.