Lexicon-Based Indonesian Local Language Abusive Words Dictionary to Detect Hate Speech in Social Media

Published: 2020-04-27 Issue: 1 Volume: 6 Page: 9
ISSN: 2443-2555
Container-title: Journal of Information Systems Engineering and Business Intelligence
Short-container-title: JISEBI

Author:

Hayaty Mardhiya, Adi Sumarni, Hartanto Anggit Dwi

Abstract

Background: Hate speech is an expression directed at a person or a group of people that conveys hatred and/or anger toward them. On social media, users are free to express themselves, including by writing harsh words and sharing them with a group of people, which can trigger division and conflict between groups. Existing research on detecting hate speech in social media follows two approaches, machine learning-based and lexicon-based. The machine learning approach has a weakness: the manual labelling process, in which an annotator separates positive, negative, and neutral opinions, is time-consuming and tiring.

Objective: This study aims to produce a dictionary containing abusive words from local languages in Indonesia. The lexicon-based approach depends heavily on the language of the words in its dictionary. Indonesia has thousands of tribes with 2500 local languages, and 80% of the population of Indonesia uses local languages in communication, which makes detecting hate speech on social media a significant challenge.

Methods: Surveys of abusive words were conducted using a proportionate stratified random sampling technique among 4 major tribes on the island of Java, namely Betawi, Sundanese, Javanese, and Madurese.

Results: The experiment produced a dictionary of 250 abusive words from 4 major Indonesian tribes for detecting hate speech in Indonesian social media using the lexicon-based approach.

Conclusion: A stratified random sampling technique was applied across 4 major Indonesian tribes to produce 250 abusive words for hate speech detection using the lexicon-based approach.

Publisher

Universitas Airlangga

References: 30 articles.

1. Z. Al and M. Amr, "Automatic hate speech detection using killer natural language processing optimizing ensemble deep learning approach," Computing, no. 0123456789, 2019.

2. M. Makrehchi, "The correlation between language shift and social conflicts in polarized social media," Proc. - 2014 IEEE/WIC/ACM Int. Jt. Conf. Web Intell. Intell. Agent Technol. - Work. WI-IAT 2014, vol. 2, pp. 169-194, 2014.

3. Y. Rao, J. Lei, L. Wenyin, Q. Li, and M. Chen, "Building emotional dictionary for sentiment analysis of online news," World Wide Web Internet Web Inf. Syst., pp. 723-742, 2014.

4. A. P. J. I. Indonesia, "Buletin APJII Edisi 22 Maret 2018," Bul. APJII, 2018.

5. W. Medhat, A. Hassan, and H. Korashy, "Sentiment analysis algorithms and applications: A survey," Ain Shams Eng. J., vol. 5, no. 4, pp. 1093-1113, Dec. 2014.

Cited by 3 articles.

1. Exploring the Impact of Lexicon-based Knowledge Transfer for Hate Speech Detection in Indonesia Code-Mixed Languages;Proceedings of the 2023 7th International Conference on Natural Language Processing and Information Retrieval;2023-12-15

2. Are There Hate Speeches on Spanish Television?;Advances in Media, Entertainment, and the Arts;2023-06-30

3. Exploring Automatic Hate Speech Detection on Social Media: A Focus on Content-Based Analysis;SAGE Open;2023-04