Affiliation:
1. Kamaraj College of Engineering and Technology
Abstract
Educational data mining (EDM) is a prominent field that applies data mining concepts, statistical analysis, and machine learning to educational data. EDM-processed data are widely used to analyse various aspects of the business and process model. Existing process models rely on conventional statistical techniques to process the data, which in turn require considerable manual intervention for data modelling and pre-processing. To address these issues, this paper proposes a novel technique that combines machine learning models with statistical approaches. The machine learning component is an ensemble of different classifiers such as decision tree, logistic regression, k-nearest neighbour, random forest, and multilayer perceptron, among others. The data used in the experiments are highly imbalanced owing to limited data availability. The proposed technique is therefore combined with the widely benchmarked synthetic minority oversampling technique (SMOTE) to address the class imbalance problem. Finally, the performance evaluation is also carried out statistically to demonstrate the efficacy of the proposed technique.
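As a rough illustration of the two ideas named in the abstract, the sketch below shows how SMOTE creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours, and how several classifiers' labels could be combined. It is not the paper's implementation: the function names `smote` and `majority_vote`, the toy data, and the choice of hard majority voting as the ensembling rule are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random
from collections import Counter


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def majority_vote(predictions):
    """Combine one label per classifier by hard majority voting (assumed rule)."""
    return Counter(predictions).most_common(1)[0][0]


# Toy, made-up data: 6 majority samples and 3 minority samples
majority = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.0, 0.4), (0.4, 0.1)]
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3)]

# Oversample the minority class until both classes have the same size
balanced_minority = minority + smote(minority, len(majority) - len(minority), k=2)

# Labels from three hypothetical classifiers for a single student record
ensemble_label = majority_vote(["pass", "fail", "pass"])
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used for evaluation.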
Publisher
Research Square Platform LLC