Albanian Fake News Detection

Published:2022-09-30 Issue:5 Volume:21 Page:1-24
ISSN:2375-4699
Container-title:ACM Transactions on Asian and Low-Resource Language Information Processing
language:en
Short-container-title:ACM Trans. Asian Low-Resour. Lang. Inf. Process.

Author:

Canhasi Ercan¹, Shijaku Rexhep¹, Berisha Erblin¹

Affiliation:

1. University of Prizren, Prizren

Abstract

Fake news has spread rapidly in recent years. The main drivers are the continuous growth of internet and social media use and the real-time dissemination of information that these platforms allow. Deceptive and misleading content, especially content created by and for social media users, is becoming increasingly hazardous, which has made fake news detection an important research topic. Despite recent advances, the lack of fake news corpora for under-resourced languages hampers the development and evaluation of detection approaches in these languages. To help fill this gap, this article investigates fake news detection for the Albanian language. We present a new public dataset of labeled true and fake news in Albanian and perform an extensive analysis of machine learning methods for fake news detection. We carried out comprehensive feature engineering and feature selection experiments, exploring Albanian language-related feature categories: lexical, syntactic, lying-detection, and psycho-linguistic features. Each article was also modeled in four different ways: with the traditional bag-of-words (BoW) representation and with three distributed text representations based on the state-of-the-art Word2Vec, FastText, and BERT methods. Additionally, we investigated the best combination of features and various types of classification methods. The experimental results are used to draw conclusions about the potential of these methods and the challenges that Albanian fake news detection presents.
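To make the bag-of-words (BoW) representation mentioned in the abstract concrete, the following is a minimal sketch only, not the authors' pipeline. It assumes whitespace tokenization and a multinomial Naive Bayes classifier with add-one smoothing; the abstract does not say which classifiers were used, so that choice is an assumption. The class name `BowNaiveBayes` and the four English training sentences are hypothetical toy examples and are not drawn from the Albanian dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real Albanian tokenization)."""
    return text.lower().split()


class BowNaiveBayes:
    """Hypothetical toy classifier: multinomial Naive Bayes over BoW counts."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # number of documents per class
        self.word_counts = defaultdict(Counter)   # per-class bag-of-words counts
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            # Log prior plus add-one smoothed log likelihood of each token.
            score = math.log(doc_count / total_docs)
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data for illustration only.
train_texts = [
    "shocking secret cure doctors hide",
    "miracle cure shocking celebrity secret",
    "parliament approves annual budget",
    "ministry publishes annual budget report",
]
train_labels = ["fake", "fake", "true", "true"]

model = BowNaiveBayes().fit(train_texts, train_labels)
prediction_fake = model.predict("secret miracle cure")
prediction_true = model.predict("parliament budget report")
print(prediction_fake, prediction_true)
```

The distributed representations named in the abstract (Word2Vec, FastText, BERT) would replace the count-based features here with dense vectors; they are left out of this sketch because they depend on pretrained models.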

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3487288


Cited by 18 articles.

1. Harnessing heterogeneity: A multi-embedding ensemble approach for detecting fake news in Dravidian languages;Computers and Electrical Engineering;2024-12

2. Authorship Attribution in Less-Resourced Languages: A Hybrid Transformer Approach for Romanian;Applied Sciences;2024-03-23

3. Ensemble Classifier for Hindi Hostile Content Detection;ACM Transactions on Asian and Low-Resource Language Information Processing;2024-01-15

4. Identification of Misinformation Using Word Embedding Technique Word2Vec, Machine Learning, and Deep Learning Models;Lecture Notes in Networks and Systems;2024

5. Detection of Fake News Using Machine Intelligence for Societal Benefit;Lecture Notes in Networks and Systems;2024