Author:
Falát Lukáš, Piscová Terézia
Abstract
This paper addresses the prediction of grade point average (GPA) with supervised machine learning models. Based on a literature review, we divide the candidate factors into three groups: psychological, sociological and study factors. Data collected by questionnaire are evaluated using statistical analysis. In a confirmatory data analysis, we compare the answers of men and women, of university students who came from grammar schools and those who came from secondary vocational schools, and of students grouped by average grade. We check normality with the Shapiro–Wilk test and test differences between groups with the Mann–Whitney U test. We identify the factors influencing GPA through correlation analysis, using the Pearson correlation test and ANOVA, and retain the factors that show a statistically significant relationship with GPA. We then build 10 supervised prediction models using linear regression, decision trees and random forests; each model predicts GPA from the independent variables. Judged by the MAPE metric on the five validation sets in cross-validation, the best generalization accuracy is achieved by a random forest model, with an average MAPE of 11.13%. We therefore recommend a random forest as a starting model for modeling student results.
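The evaluation described in the abstract, averaging MAPE over five validation folds, can be sketched in plain Python. This is only an illustration under stated assumptions: the toy data, the single "study hours" feature, the contiguous unshuffled folds, and the mean-value baseline `fit_mean_baseline` are all hypothetical and are not the authors' data, features or models.

```python
from statistics import mean


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent (assumes no true value is 0)."""
    return 100.0 * mean(abs((t - p) / t) for t, p in zip(y_true, y_pred))


def k_fold_indices(n, k=5):
    """Split range(n) into k contiguous folds of near-equal size (no shuffling)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_mape(X, y, fit, k=5):
    """Return (average validation MAPE, per-fold MAPEs) over k folds.

    fit(X_train, y_train) must return a predict(X) callable.
    """
    scores = []
    for val_idx in k_fold_indices(len(y), k):
        held_out = set(val_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict([X[i] for i in val_idx])
        scores.append(mape([y[i] for i in val_idx], preds))
    return mean(scores), scores


def fit_mean_baseline(X_train, y_train):
    """Hypothetical baseline: always predict the training-set mean GPA."""
    avg = mean(y_train)
    return lambda X: [avg for _ in X]


# Made-up toy data: one feature (weekly study hours) and one GPA per student.
X = [[h] for h in [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]]
y = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8]

avg_mape, fold_mapes = cross_val_mape(X, y, fit_mean_baseline, k=5)
print(f"average MAPE over {len(fold_mapes)} folds: {avg_mape:.2f}%")
```

Any model can be compared under the same protocol by passing a different `fit` function, for example one wrapping a linear regression or a random forest. On real data the rows would normally be shuffled before splitting.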
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
7 articles.