Affiliation:
1. Faculty of Human and Social Studies, Mykolas Romeris University, 20 Ateities St., Vilnius, Lithuania
2. Institute of Digital Resources and Interdisciplinary Research, Vytautas Magnus University, 23-216 V. Putvinskio St., Kaunas, Lithuania
3. Faculty of Public Governance and Business, Mykolas Romeris University, 20 Ateities St., Vilnius, Lithuania
Abstract
This study investigates the feasibility of identifying offensive language in Lithuanian using the Simplified Offensive Language Taxonomy (SOLT). The taxonomy is designed to complement existing offensive language ontologies and tagset systems, with the ultimate goal of integrating it into publicly accessible Linguistic Linked Open Data (LLOD) resources. The dataset is a publicly available corpus of user-generated comments collected from a Lithuanian portal (Amilevičius et al. 2016). The study found that offensive language in the corpus predominantly targets groups rather than individuals. The most common category relates to physical and mental disabilities, followed by ideological offences, xenophobic and sexist remarks, and less frequent categories such as ageism, classism, homophobia, and religious discrimination. These results highlight the diverse range of offensive language online and underscore the need to combat discrimination and promote respectful discourse, particularly concerning marginalised groups.
Subject
Linguistics and Language, Communication, Language and Linguistics