Abstract
Written communication is pivotal for societies to develop. However, lexicon and depth of information vary greatly among texts according to their purpose. Scientific texts, diffusion of science reports, and general and area-specific news are all written differently. We therefore explore the characterization of different text categories through a nature-inspired feature known as the Hurst parameter. We contend that the Hurst parameter is useful for unveiling the rhetorical structure within written documents. We collected and processed texts in five categories: scientific articles, diffusion of science reports, business news, entertainment news, and random texts. Each category contains 350 documents. We found that scientific texts have the highest median Hurst parameter (0.575), followed by business news (0.54); the median for randomly generated texts is 0.48, which lies in the region associated with random walks. The median value is 0.49 for diffusion texts and 0.53 for entertainment texts. However, these two categories present high dispersion. We conclude that the Hurst parameter is a measure that quantifies the structure of communication in the selected categories of texts. We discuss applications of our findings in the field of e-research.
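To make the central quantity concrete, the following is a minimal sketch of how a Hurst parameter could be estimated for a text. It rests on two assumptions that this abstract does not confirm: the estimator is classical rescaled-range (R/S) analysis, and each document is turned into a numeric series by taking word lengths. The names `rescaled_range`, `hurst_rs`, and `text_to_series` are hypothetical and chosen for illustration only; the authors' actual mapping and estimator may differ.

```python
import math
import random


def rescaled_range(chunk):
    """Return R/S for one window, or None if the window has zero variance."""
    n = len(chunk)
    mean = sum(chunk) / n
    # Cumulative sum of deviations from the window mean.
    cumulative, running = [], 0.0
    for value in chunk:
        running += value - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    return spread / std if std > 0 else None


def hurst_rs(series, min_window=8):
    """Estimate the Hurst parameter as the slope of log(R/S) vs. log(window size)."""
    points = []
    window = min_window
    while window <= len(series) // 2:
        values = []
        # Non-overlapping windows of the current size.
        for start in range(0, len(series) - window + 1, window):
            rs = rescaled_range(series[start:start + window])
            if rs is not None:
                values.append(rs)
        if values:
            points.append((math.log(window), math.log(sum(values) / len(values))))
        window *= 2
    if len(points) < 2:
        raise ValueError("series too short for R/S analysis")
    # Ordinary least-squares slope of the log-log points.
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    den = sum((x - mean_x) ** 2 for x, _ in points)
    return num / den


def text_to_series(text):
    """Assumed mapping (not from the paper): one value per word, its length."""
    return [len(word) for word in text.split()]


if __name__ == "__main__":
    # Uncorrelated noise stands in for a "random text" baseline.
    rng = random.Random(0)
    noise = [rng.gauss(0, 1) for _ in range(4096)]
    print(round(hurst_rs(noise), 3))
```

Under these assumptions, a document would be scored with `hurst_rs(text_to_series(document))`. R/S estimates are known to be biased on short series, so values from this sketch are not directly comparable with the medians reported above.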
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science