DBSCAN algorithm: Twitter text clustering of trend topic Pilkada Pekanbaru

Author:

Mustakim, Nurul Gayatri Indah Reza, Novita Rice, Kharisma Oktaf Brillian, Vebrianto Rian, Sanjaya Suwanto, Hasbullah, Andriani Tuti, Sari Wardani Purnama, Novita Yulia, Rahim Robbi

Abstract

Social media platforms such as Twitter are among the most common channels of communication. Every tweet contains data, such as text, that can be collected and processed into information. Processed tweet data reveals trends that are useful in fields such as education, economics, and politics, and this need gave rise to text mining. Text mining techniques are needed to find interesting patterns when searching for trends in Twitter text on topics related to Pilkada Pekanbaru 2017. This research clusters Twitter text data using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. Several experiments were run with different Eps and MinPts parameters on 2,184 text records that had passed through several stages: cleaning, duplicate removal, and pre-processing such as stemming and stopword removal. Based on the highest average Silhouette Index (SI), Eps = 0.1 and MinPts = 10, with SI = 0.413, were chosen as parameters, forming 31 clusters. By frequency of word occurrence in the clusters, the most frequent word is “kpu”, followed by “firdaus”, “kota”, “pasang”, and “ayat”. The candidate pair that appears most often in the cluster results is Firdaus-Ayat, and in the results of Pilkada 2017, Firdaus-Ayat was elected Mayor and Vice Mayor of Pekanbaru.
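To make the role of Eps, MinPts, and the Silhouette Index concrete, the following is a minimal sketch of DBSCAN over bag-of-words vectors, not the authors' implementation. Several things here are assumptions: the abstract does not state the distance metric (cosine distance is used), the toy documents are invented for illustration, and excluding noise points from the silhouette average is one possible convention. The function names `cosine_distance`, `dbscan`, and `silhouette` are hypothetical.

```python
import math
from collections import Counter

def cosine_distance(a, b):
    """1 - cosine similarity of two term-count vectors (Counters), clamped at 0."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return max(0.0, 1.0 - dot / norm) if norm else 1.0

def dbscan(dist, eps, min_pts):
    """Return one label per point: cluster id 0, 1, ... or -1 for noise.
    dist is a full pairwise distance matrix; a point counts as its own neighbor."""
    n = len(dist)
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(n) if dist[i][j] <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = [k for k in range(n) if dist[j][k] <= eps]
            if len(j_neighbors) >= min_pts:  # j is a core point, keep expanding
                queue.extend(j_neighbors)
    return labels

def silhouette(dist, labels):
    """Mean silhouette over non-noise points; None if fewer than two clusters."""
    clusters = {}
    for idx, lab in enumerate(labels):
        if lab != -1:
            clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        return None
    scores = []
    for lab, members in clusters.items():
        for i in members:
            if len(members) == 1:
                scores.append(0.0)
                continue
            # a: mean distance to own cluster; b: mean distance to nearest other cluster
            a = sum(dist[i][j] for j in members if j != i) / (len(members) - 1)
            b = min(sum(dist[i][j] for j in other) / len(other)
                    for other_lab, other in clusters.items() if other_lab != lab)
            scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Invented toy "tweets" (already cleaned and stemmed), for illustration only.
docs = [
    "kpu kota pekanbaru", "kpu kota pekanbaru", "kpu kota",
    "firdaus ayat pasang", "firdaus ayat pasang", "firdaus ayat",
    "cuaca hari ini",
]
vectors = [Counter(d.split()) for d in docs]
dist = [[cosine_distance(u, v) for v in vectors] for u in vectors]

labels = dbscan(dist, eps=0.2, min_pts=3)
score = silhouette(dist, labels)
print(labels, score)
```

Repeating the last three lines over a grid of Eps and MinPts values and keeping the pair with the highest score mirrors the selection procedure the abstract describes, although the values that work for this toy data say nothing about the paper's choice of Eps = 0.1 and MinPts = 10.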

Publisher

IOP Publishing

Subject

General Physics and Astronomy

Cited by 9 articles.

1. Text Clustering of Tafseer Translations by Using k-means Algorithm: An Al-Baqarah Chapter View;Annals of Emerging Technologies in Computing;2023-10-01

2. Mass-Suite: a novel open-source python package for high-resolution mass spectrometry data analysis;Journal of Cheminformatics;2023-09-23

3. Density Based Spatial Clustering of Applications with Noise and Sentence Bert Embedding for Indonesian Utterance Clustering;2023 International Conference on Computer Science, Information Technology and Engineering (ICCoSITE);2023-02-16

4. Short text topic modelling approaches in the context of big data: taxonomy, survey, and analysis;Artificial Intelligence Review;2022-10-26

5. Implementation of Text Association Rules about Terrorism on Twitter in Indonesia;2022 10th International Conference on Cyber and IT Service Management (CITSM);2022-09-20
