Abstract
In the modern world, specialized programs (bots) write comments, news, reviews, which may contain false information. As a result, it is extremely important to know whether a given text was written by a real person or a bot. This work aims to study the semantic trajectories of texts in natural languages to analyse the aforementioned problem. The study utilizes the concepts of vector embeddings and their n-grams, as well as methods for (1) clustering the semantic space, (2) analysing the position of texts on the 'entropy-complexity' plane, (3) estimating the intrinsic dimensionalities of vector language representations, and (4) topological data analysis.
Publisher
Keldysh Institute of Applied Mathematics