Affiliation:
1. Division of Computer Science and Engineering, Sahmyook University, Hwarangro 815, Seoul 01795, Republic of Korea
2. Department of Digital Contents Design, Ulsan College, Ulsan 44610, Republic of Korea
Abstract
Student dropout is a serious issue: it harms not only the students who leave but also their university, their families, and society as a whole. To address it, various attempts have been made to predict student dropout using machine learning. This paper presents a machine learning model that predicts student dropout at Sahmyook University. Academic records collected from 20,050 students of the university were analyzed and used for training. The model was implemented with several machine learning algorithms, including Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, Deep Neural Network, and LightGBM (Light Gradient Boosting Machine), and their performances were compared experimentally. We also discuss the influence of oversampling, which was used to mitigate the class imbalance in the dropout data. For this purpose, several oversampling algorithms were tested, including SMOTE, ADASYN, and Borderline-SMOTE. Our experimental results showed that the proposed model implemented with LightGBM performed best, with an F1-score of 0.840, which is higher than the results reported in previous studies that addressed dropout prediction under class imbalance.
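As an illustration of the two ideas the abstract relies on, the following minimal Python sketch shows how the F1-score is computed and how SMOTE-style oversampling creates synthetic minority-class samples by interpolating between a sample and one of its nearest minority neighbours. It is not the authors' implementation: the function names `f1_score` and `smote_like_oversample`, the parameter `k`, and the toy data are assumptions made for this example, and the interpolation is a simplified version of SMOTE written with the standard library only.

```python
import random


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive (dropout) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Simplified SMOTE-style oversampling (hypothetical helper).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Toy data (assumed for illustration): 1 = dropout, 0 = enrolled.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
score = f1_score(y_true, y_pred)

# Toy minority-class feature vectors, e.g. (GPA, credits earned).
minority = [(2.0, 10.0), (2.5, 12.0), (1.8, 9.0)]
new_points = smote_like_oversample(minority, n_new=3)
```

Because the F1-score ignores true negatives, it is a common choice for imbalanced problems such as dropout prediction, where plain accuracy can look high even when most dropouts are missed. In practice, oversampling of this kind is applied only to the training split so that the test set keeps its original class distribution.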
Funder
Sahmyook University Research Fund
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science