Authors:
Anugerah Ayu Media, Haris Muhendra Abdul
Abstract
People now freely share their opinions on a wide range of issues via social media, generating huge amounts of data every day. Public opinion on Twitter is diverse, which makes it a suitable source for sentiment analysis. However, many Twitter users express their opinions with slang words. Slang in the text can lead to errors in language processing because the corresponding standard words are absent. This research aimed to investigate how adding a slang word dictionary to the preprocessing stage affects the performance of sentiment analysis. The sentiment analysis was performed using a Naïve Bayes classifier as the classification algorithm, with term frequency-inverse document frequency (TF-IDF) for feature extraction. The research compared the performance of sentiment analysis on data preprocessed with a slang dictionary against data preprocessed without one. The case studied was text related to the COVID-19 pandemic in Indonesia, especially text about the implementation of vaccination. The performance evaluation indicates that sentiment analysis of data preprocessed with a slang word dictionary achieved better accuracy than sentiment analysis of data preprocessed without it.
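To make the compared pipelines concrete, the following is a minimal sketch of slang normalization followed by TF-IDF weighting and a multinomial Naïve Bayes classifier. It is not the authors' implementation. The slang entries in `SLANG`, the four training texts in `TRAIN`, the smoothed IDF formula, the smoothing parameter `alpha`, and all function names are assumptions made purely for illustration.

```python
import math
from collections import Counter

# Hypothetical slang dictionary (slang -> standard Indonesian word); illustrative only.
SLANG = {"gak": "tidak", "bgt": "banget", "yg": "yang"}

# Invented toy training data, NOT taken from the study.
TRAIN = [
    ("vaksin bagus banget", "pos"),
    ("vaksin aman dan bagus", "pos"),
    ("tidak mau vaksin", "neg"),
    ("tidak percaya vaksin", "neg"),
]

def normalize(text, slang=None):
    """Lowercase, split on whitespace, and optionally replace slang tokens."""
    tokens = text.lower().split()
    return [slang.get(t, t) for t in tokens] if slang else tokens

def fit_idf(docs):
    """Smoothed inverse document frequency (an assumed variant): ln((1+n)/(1+df)) + 1."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(doc, idf):
    """TF-IDF weights for one document; tokens unseen in training are dropped."""
    tf = Counter(t for t in doc if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}

def train_nb(docs, labels, idf, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights with additive smoothing."""
    model = {}
    for y in set(labels):
        weights = {}
        for d, label in zip(docs, labels):
            if label == y:
                for t, w in tfidf(d, idf).items():
                    weights[t] = weights.get(t, 0.0) + w
        total = sum(weights.values()) + alpha * len(idf)
        log_prior = math.log(labels.count(y) / len(labels))
        log_lik = {t: math.log((weights.get(t, 0.0) + alpha) / total) for t in idf}
        model[y] = (log_prior, log_lik)
    return model

def predict(doc, model, idf):
    """Return the class with the highest log-posterior score."""
    feats = tfidf(doc, idf)
    scores = {
        y: prior + sum(w * log_lik[t] for t, w in feats.items())
        for y, (prior, log_lik) in model.items()
    }
    return max(scores, key=scores.get)

def classify(text, use_slang):
    """Train on TRAIN and classify text, with or without slang normalization."""
    slang = SLANG if use_slang else None
    docs = [normalize(t, slang) for t, _ in TRAIN]
    labels = [y for _, y in TRAIN]
    idf = fit_idf(docs)
    model = train_nb(docs, labels, idf)
    return predict(normalize(text, slang), model, idf)

if __name__ == "__main__":
    tweet = "vaksin gak aman"  # invented example: "the vaccine is not safe"
    print("with slang dictionary:   ", classify(tweet, use_slang=True))
    print("without slang dictionary:", classify(tweet, use_slang=False))
```

On this toy data the two variants disagree. Without the dictionary, the negation "gak" is unseen and dropped, so the text is scored only on "vaksin" and "aman". With the dictionary, "gak" maps to "tidak", which carries negative evidence from the training texts. This illustrates the mechanism only and says nothing about the size of the accuracy difference reported in the study.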