Abstract
AbstractMost natural language models and tools are restricted to one language, typically English. For researchers in the behavioral sciences investigating languages other than English, and for those researchers who would like to make cross-linguistic comparisons, hardly any computational linguistic tools exist, particularly none for those researchers who lack deep computational linguistic knowledge or programming skills. Yet, for interdisciplinary researchers in a variety of fields, ranging from psycholinguistics, social psychology, cognitive psychology, education, to literary studies, there certainly is a need for such a cross-linguistic tool. In the current paper, we present Lingualyzer (https://lingualyzer.com), an easily accessible tool that analyzes text at three different text levels (sentence, paragraph, document), which includes 351 multidimensional linguistic measures that are available in 41 different languages. This paper gives an overview of Lingualyzer, categorizes its hundreds of measures, demonstrates how it distinguishes itself from other text quantification tools, explains how it can be used, and provides validations. Lingualyzer is freely accessible for scientific purposes using an intuitive and easy-to-use interface.
Publisher
Springer Science and Business Media LLC
Subject
General Psychology,Psychology (miscellaneous),Arts and Humanities (miscellaneous),Developmental and Educational Psychology,Experimental and Cognitive Psychology
Reference113 articles.
1. Abney, D. H., Dale, R., Louwerse, M. M., & Kello, C. T. (2018). The bursts and lulls of multimodal interaction: Temporal distributions of behavior reveal differences between verbal and non-verbal communication. Cognitive Science, 42(4), 1297–1316. https://doi.org/10.1111/cogs.12612
2. Adelman, J. S., Brown, G. D., & Quesada, J. F. (2006). Contextual diversity, not word frequency, determines word-naming and lexical decision times. Psychological Science, 17(9), 814–823. https://doi.org/10.1111/j.1467-9280.2006.01787.x
3. Alvero, A., Giebel, S., Gebre-Medhin, B., Antonio, A. L., Stevens, M. L., & Domingue, B. W. (2021). Essay content and style are strongly related to household income and SAT scores: Evidence from 60,000 undergraduate applications. Science. Advances, 7(42). https://doi.org/10.1126/sciadv.abi9031
4. Artetxe, M., Aldabe, I., Agerri, R., Perez-De-Viñaspre, O., & Soroa, A. (2022). Does corpus quality really matter for low-resource languages? In Y. Goldberg, Z. Kozareva, & Y. Zhang, Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 7383–7390). Association for Computational Linguistics.
5. Barbieri, F., & Saggion, H. (2014). Automatic detection of irony and humour in Twitter. In S. Colton, D. Ventura, N. Lavrac, & M. Cook, Proceedings of the Fifth International Conference on Computational Creativity (pp. 155–162). Association for Computational Creativity.
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献