Affiliation:
1. Department of Chinese Language and Literature, Tsinghua University, Beijing, China
2. School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
Abstract
Stylistic analysis enables open-ended and exploratory observation of languages. To fill the gap in the quantitative analysis of the stylistic systems of Middle Chinese, we construct lexical features based on the evolving usage of core words and devise a Bayesian method for estimating the feature parameters. The lexical features are drawn from the Swadesh list, each entry of which took different word forms as the language evolved during the Middle Ages. We therefore use the counts of the variant word forms of these entries as the linguistic features. Under the Bayesian formulation, the estimated feature parameters form a high-dimensional random feature vector, from which we obtain the pairwise dissimilarity matrix of all the texts under different distance measures. Finally, we perform spectral embedding and clustering to visualize, categorize, and analyze the linguistic styles of Middle Chinese texts. The quantitative results agree with existing qualitative conclusions and, furthermore, improve our understanding of the linguistic styles of Middle Chinese from both inter-category and intra-category perspectives. They also help unveil the special styles induced by indirect language contact.
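The first stages of the pipeline described above (variant counts, Bayesian parameter estimates, pairwise dissimilarities) can be illustrated with a minimal sketch. The abstract does not specify the prior or the distance measures, so the symmetric Dirichlet prior, the Jensen-Shannon distance, the function names, and the toy counts below are all assumptions made for illustration, not the authors' method.

```python
import math

def posterior_mean(counts, alpha=1.0):
    """Posterior mean of multinomial probabilities under a symmetric
    Dirichlet(alpha) prior (assumed prior; not taken from the paper)."""
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]

def js_distance(p, q):
    """Jensen-Shannon distance with base-2 logs, so values lie in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return math.sqrt(max(0.0, (kl(p, m) + kl(q, m)) / 2))

def dissimilarity_matrix(texts, alpha=1.0):
    """texts[i][k] holds the counts of each variant word form of
    Swadesh entry k in text i. Returns a symmetric matrix of the
    mean per-entry Jensen-Shannon distance between texts."""
    feats = [[posterior_mean(c, alpha) for c in t] for t in texts]
    n = len(feats)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            per_entry = [js_distance(p, q) for p, q in zip(feats[i], feats[j])]
            d[i][j] = d[j][i] = sum(per_entry) / len(per_entry)
    return d

# Hypothetical toy data: 3 texts, 2 entries, 2 competing word forms each.
texts = [
    [[9, 1], [8, 2]],  # text 0 prefers the first form of both entries
    [[8, 2], [9, 1]],  # text 1 has a similar preference
    [[1, 9], [2, 8]],  # text 2 prefers the second form of both entries
]
D = dissimilarity_matrix(texts)
```

A matrix such as `D` could then be converted to affinities and passed to a spectral embedding routine for the visualization and clustering step; that stage is omitted here to keep the sketch dependency-free.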
Funder
National Social Science Fund of China
National Youth Talent Support Program of China
Publisher
Association for Computing Machinery (ACM)