Abstract
The ratio of the number Xn of different words (types) in a text of n words (tokens) to the text length n has received considerable attention in the literature of statistical linguistics. The present note presents two stochastic models for Xn, each based on an inhomogeneous discrete Markov process of the pure birth type whose transition probabilities take certain forms depending only upon n. These models are then tested against data obtained from the plays of William Shakespeare.
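As an illustration of the quantities named in the abstract, the following minimal Python sketch counts the running number of types Xn in a token sequence and simulates a pure birth chain in which Xn grows by one with a probability that depends only on n. The function names and the particular form of `example_birth_prob` are assumptions made here for illustration; they are not the transition probabilities proposed in the note.

```python
import random


def type_token_counts(tokens):
    """Return [X_1, ..., X_n]: the number of distinct types among the first n tokens."""
    seen = set()
    counts = []
    for tok in tokens:
        seen.add(tok)
        counts.append(len(seen))
    return counts


def simulate_pure_birth(n_tokens, birth_prob, seed=0):
    """Simulate X_1..X_n for an inhomogeneous pure birth chain.

    At step n the chain moves from X_n to X_n + 1 with probability
    birth_prob(n) and otherwise stays put. The probability depends
    only on n, not on the current state.
    """
    rng = random.Random(seed)
    x = 1  # the first token is necessarily a new type
    path = [x]
    for n in range(1, n_tokens):
        if rng.random() < birth_prob(n):
            x += 1  # a new type appears
        path.append(x)
    return path


def example_birth_prob(n):
    # Hypothetical decreasing form chosen only for illustration;
    # NOT one of the models from the note.
    return 1.0 / (1.0 + 0.05 * n)


if __name__ == "__main__":
    observed = type_token_counts("to be or not to be".split())
    print(observed)
    simulated = simulate_pure_birth(1000, example_birth_prob)
    print(simulated[-1] / len(simulated))  # simulated type-token ratio
```

Comparing the observed curve from `type_token_counts` with simulated paths is one simple way to see how a chosen form of the birth probability shapes the type-token ratio as the text grows.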
Publisher
Cambridge University Press (CUP)
Subject
Statistics, Probability and Uncertainty; General Mathematics; Statistics and Probability
Cited by
22 articles.
1. Estimating Global Completeness of Event Logs: A Comparative Study;IEEE Transactions on Services Computing;2021-03-01
2. Linguistic laws in chimpanzee gestural communication;Proceedings of the Royal Society B: Biological Sciences;2019-02-13
3. Forms and Degrees of Repetition in Texts;QUANT LINGUIST;2015-02-13
4. Type-token models: a comparative study;Journal of Quantitative Linguistics;2014-12-17
5. General type-token distribution;Biometrika;2014-08-17