Author:
Mustakim, Nurul Gayatri Indah Reza, Novita Rice, Kharisma Oktaf Brillian, Vebrianto Rian, Sanjaya Suwanto, Hasbullah, Andriani Tuti, Sari Wardani Purnama, Novita Yulia, Rahim Robbi
Abstract
Social media platforms such as Twitter are among the most common channels of communication. Every tweet contains data, such as text, which can be collected and processed into information. Processed tweet data reveals trends that can inform fields such as education, economics, and politics, which has motivated the use of text mining. Text mining techniques are needed to find interesting patterns and trends in Twitter text on topics related to Pilkada Pekanbaru 2017 (the 2017 Pekanbaru mayoral election). This research clusters Twitter text data using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. Several experiments were run with different Eps and MinPts parameters on 2,184 text records that had passed through several stages: cleaning, duplicate removal, and pre-processing such as stemming and stopword removal. Based on the highest average Silhouette Index (SI), Eps = 0.1 and MinPts = 10, with SI = 0.413, were chosen as parameters, forming 31 clusters. By frequency of word occurrence in the clusters, the most frequent word is “kpu”, followed by “firdaus”, “kota”, “pasang”, and “ayat”. The candidate pair that appears most often in the clustering results is Firdaus-Ayat, and in the results of Pilkada 2017, Firdaus-Ayat was elected Mayor and Vice Mayor of Pekanbaru.
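The parameter search described in the abstract (running DBSCAN for several Eps and MinPts values and keeping the setting with the highest average Silhouette Index) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how tweets were vectorized or which distance was used, so cosine distance on raw term counts is an assumption here, as is the choice to leave noise points out of the Silhouette Index. The documents are invented toy data, and all function names are hypothetical.

```python
import math
from collections import Counter

def cosine_distance(a, b):
    """Cosine distance between two sparse term-count vectors (assumed metric)."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return 1.0 if na == 0 or nb == 0 else 1.0 - dot / (na * nb)

def dbscan(dist, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    n = len(dist)
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(n) if dist[i][j] <= eps]
        if len(neighbors) < min_pts:      # not a core point (count includes i)
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:           # former noise becomes a border point
                labels[j] = cluster
            if labels[j] is not None:     # already assigned, do not expand
                continue
            labels[j] = cluster
            reach = [k for k in range(n) if dist[j][k] <= eps]
            if len(reach) >= min_pts:     # j is a core point, keep expanding
                queue.extend(reach)
    return labels

def silhouette_index(dist, labels):
    """Mean silhouette over clustered points; noise is ignored (assumption)."""
    clusters = {}
    for i, lab in enumerate(labels):
        if lab != -1:
            clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        return None                       # undefined for fewer than 2 clusters
    scores = []
    for lab, members in clusters.items():
        for i in members:
            if len(members) == 1:
                scores.append(0.0)
                continue
            a = sum(dist[i][j] for j in members if j != i) / (len(members) - 1)
            b = min(sum(dist[i][j] for j in other) / len(other)
                    for other_lab, other in clusters.items() if other_lab != lab)
            scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def grid_search(docs, eps_values, min_pts_values):
    """Try every (Eps, MinPts) pair and keep the one with the highest SI."""
    vectors = [Counter(d.split()) for d in docs]
    dist = [[cosine_distance(a, b) for b in vectors] for a in vectors]
    best = None
    for eps in eps_values:
        for min_pts in min_pts_values:
            labels = dbscan(dist, eps, min_pts)
            si = silhouette_index(dist, labels)
            if si is not None and (best is None or si > best[0]):
                best = (si, eps, min_pts, labels)
    return best

# Toy, already pre-processed documents (invented for illustration only).
docs = [
    "kpu kota pekanbaru", "kpu kota pekanbaru pilkada", "kpu kota",
    "firdaus ayat pasang", "firdaus ayat", "pasang firdaus ayat calon",
    "cuaca hujan hari",
]
best_si, best_eps, best_min_pts, best_labels = grid_search(
    docs, eps_values=[0.1, 0.2, 0.3], min_pts_values=[2, 3])
print(best_eps, best_min_pts, round(best_si, 3), best_labels)
```

On real data the grid would cover the Eps and MinPts ranges tested in the study, and the frequent words in each resulting cluster could then be counted with `Counter` over the documents that share a label.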
Subject
General Physics and Astronomy