Comparison of Random Forest and Support Vector Machine for Indonesian Tweet Complaint Classification

Published: 2019-12-05 Page: 202-207
ISSN: 2456-3307
Container-title: International Journal of Scientific Research in Computer Science, Engineering and Information Technology
Language: en
Short-container-title: IJSRCSEIT

Author:

Ramayanti Desi¹

Affiliation:

1. Faculty of Computer Science, Universitas Mercu Buana, Jakarta Barat, Indonesia

Abstract

In digital business, managers commonly need to process text so that it can support decision-making. The number of text documents containing ideas and opinions keeps growing, and reading them one by one is impractical. If the data are processed and presented correctly using machine learning, they can quickly give a general overview of a particular case, organization, or object. Much research has been done in this area; nevertheless, most studies concentrate on English text classification. Each language calls for different techniques or methods to classify text, depending on the characteristics of its grammar, so classification results may differ between languages even when the same algorithm is used. Given the importance of text classification, two algorithms that can be applied are the support vector machine (SVM) and random forest (RF). Against this background, this research aims to determine the performance of the SVM and RF algorithms in classifying Indonesian text. The SVM classifier, evaluated with 10-fold cross-validation, achieved the best accuracy, 0.9648, with a computation time of 40.118 seconds. The RF classifier with the parameter values 'bootstrap': False, 'min_samples_leaf': 1, 'n_estimators': 10, 'min_samples_split': 3, 'criterion': 'entropy', 'max_features': 3, 'max_depth': None achieved an accuracy of 0.9561 with a computation time of 109.399 seconds.
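The abstract reports accuracy under 10-fold cross-validation together with computation time, but does not show how either was measured. The following is a minimal, standard-library-only sketch of such a measurement, not the author's code. The function names (`k_fold_indices`, `cross_validate`, `majority_baseline`), the toy texts, and the labels are all hypothetical placeholders; a real comparison would plug in trained SVM and RF models and the actual tweet data. The hyperparameter names quoted in the abstract match arguments of scikit-learn's `RandomForestClassifier`, although the abstract does not name the library used.

```python
import time
from statistics import mean


def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds of near-equal size.

    Assumes n_samples >= k. No shuffling is done here; real data would
    usually be shuffled or stratified first.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(train_and_predict, texts, labels, k=10):
    """Return (mean accuracy over k folds, elapsed wall-clock seconds)."""
    started = time.perf_counter()
    scores = []
    for test_idx in k_fold_indices(len(texts), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(texts)) if i not in held_out]
        predictions = train_and_predict(
            [texts[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [texts[i] for i in test_idx],
        )
        correct = sum(p == labels[i] for p, i in zip(predictions, test_idx))
        scores.append(correct / len(test_idx))
    return mean(scores), time.perf_counter() - started


def majority_baseline(train_texts, train_labels, test_texts):
    """Hypothetical stand-in model: always predict the most common training label."""
    most_common = max(set(train_labels), key=train_labels.count)
    return [most_common for _ in test_texts]


# Toy placeholder data, NOT the tweets used in the paper.
texts = [f"tweet {i}" for i in range(20)]
labels = ["not_complaint" if i % 4 == 0 else "complaint" for i in range(20)]

accuracy, seconds = cross_validate(majority_baseline, texts, labels, k=10)
print(f"mean accuracy: {accuracy:.4f}, time: {seconds:.3f} s")
```

Any function with the same `(train_texts, train_labels, test_texts)` signature can replace `majority_baseline`, so the same harness could time an SVM and an RF under identical folds, which is what makes the two accuracy and time figures comparable.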

Publisher

Technoscience Academy

Subject

General Medicine

Cited by 2 articles.

1. Superpixel segmentation integrated feature subset selection for wetland classification over Yellow River Delta;Environmental Science and Pollution Research;2023-02-17

2. Performance Evaluation of Support Vector Machine Algorithm for Human Gesture Recognition;International Journal of Scientific Research in Science, Engineering and Technology;2020-12-10