Spot the bot: large-scale natural language structure

Published:2024 Page:281-312
ISSN:2619-0109
Container-title:Futurity designing. Digital reality problems

Author:

Gromov Vasilii Aleksandrovich, Borodin Nikita Sergeevich, Kogan Alexandra Sergeevna, Dang Quynh Nhu, Yerbolova Asel Serikanovna, Bayan Hendawi

Abstract

In the modern world, specialized programs (bots) write comments, news, and reviews that may contain false information. It is therefore important to determine whether a given text was written by a real person or a bot. This work studies the semantic trajectories of natural-language texts in order to address this problem. The study uses vector embeddings and their n-grams, together with methods for (1) clustering the semantic space, (2) analysing the position of texts on the 'entropy-complexity' plane, (3) estimating the intrinsic dimensionalities of vector language representations, and (4) topological data analysis.

Publisher

Keldysh Institute of Applied Mathematics

References: 51 articles.

1. Gromov VA, Migrina AM. A language as a self-organized critical system. Complexity 2017;2017:9212538. https://doi.org/10.1155/2017/9212538.

2. Garg, M., Gupta, A. K., & Prasad, R. (Eds.). (2022). Graph Learning and Network Science for Natural Language Processing. CRC Press.

3. Garg, M., Kumar, M., & Samanta, D. (2023). Towards Pattern Recognition with Network Science and Natural Language Processing for Information Retrieval.

4. Garg, M., & Kumar, M. (2018). The structure of word co-occurrence network for microblogs. Physica A: Statistical Mechanics and its Applications, 512, 698-720.

5. Markovič R, Gosak R, Perc M, Marhl M, Grubelnik V. Applying network theory to fables: complexity in Slovene belles-lettres for different age groups. Complex Networks 2018;7:114-127. https://doi.org/10.1093/comnet/cny018.