Affiliation:
1. Institute of Polish Language, Polish Academy of Sciences, Warsaw, Poland
Abstract
This paper discusses the application of standard keyword extraction methods from corpus linguistics to the study of the old Polish language. The analysis is based on writings included in the Electronic Corpus of 17th- and 18th-century Polish Texts. Its aim is to select keywords from over two million tokens drawn from texts tagged as religion in the corpus by comparing them with a reference corpus of over nine million tokens. In doing so, it verifies the applicability of the log-likelihood method to the analysis of the old Polish language and develops part of the research model.
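To illustrate the kind of comparison the abstract describes, the following is a minimal sketch of log-likelihood keyness scoring in its commonly used two-corpus form (observed versus expected frequencies). It is not the authors' implementation: the function names, the toy token lists, and the rule of keeping only words that are relatively more frequent in the study corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness of one word.

    a: frequency of the word in the study corpus
    b: frequency of the word in the reference corpus
    c: total number of tokens in the study corpus
    d: total number of tokens in the reference corpus
    """
    # Expected frequencies if the word were spread evenly over both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    # A zero observed frequency contributes nothing to the sum.
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2.0 * ll


def keywords(study_tokens, reference_tokens, top_n=10):
    """Rank words overused in the study corpus by log-likelihood (assumed filter)."""
    study = Counter(study_tokens)
    ref = Counter(reference_tokens)
    c = sum(study.values())
    d = sum(ref.values())
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        # Keep only words whose relative frequency is higher in the study corpus.
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical toy data; a real run would use tokens exported from the corpus.
study_tokens = ["bóg"] * 5 + ["i"] * 5
reference_tokens = ["i"] * 10 + ["król"] * 10
ranked = keywords(study_tokens, reference_tokens)
```

In this toy example only "bóg" is returned, since "i" has the same relative frequency in both lists. On real data a significance cut-off on the score and a minimum frequency threshold are usually applied as well; those choices are not specified in the abstract.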
Subject
Linguistics and Language, Language and Linguistics
References (14 articles)
1. Adamiec, D. (2015). Kryteria doboru tekstów do ‘Elektronicznego korpusu tekstów polskich z XVII i XVIII w. (do 1772 r.)’. Prace Filologiczne, 67, pages 11–20.
2. Electronic Corpus of 17th- and 18th-century Polish Texts (until 1772). Accessible at: https://korba.edu.pl/korba1.
3. Gruszczyński, W., Adamiec, D., Bronikowska, R., Kieraś, W., Modrzejewski, E., Wieczorek, A., and Woliński, M. (2022). The Electronic Corpus of 17th- and 18th-century Polish Texts. Language Resources and Evaluation, 56, pages 309–332.
4. Kieraś, W., and Zawadzka-Paluektau, N. (2023). Słowa klucze polskiego dyskursu politycznego na przestrzeni ostatnich stu lat: analiza korpusu exposé premierów (Manuscript submitted for publication).
5. Majdak, M. (2016). Słowa klucze w materiale historycznym – wyzwania i ograniczenia. Przegląd Humanistyczny, 3, pages 45–56.