Affiliation:
1. The Queen's University of Belfast
Abstract
This paper presents results from a series of missing-word tests, in which a small fragment of text is presented to human subjects, who are then asked to suggest a ranked list of completions. The same experiment is repeated with the WA model, an n-gram statistical language model. From the completion data two measures are obtained: (i) verbatim predictability, which indicates the extent to which subjects nominated exactly the missing word, and (ii) grammatical class predictability, which indicates the extent to which subjects nominated words of the same grammatical class as the missing word. The differences between language model performance and human performance are encouragingly small, especially for verbatim predictability. This is particularly significant given that the WA model was able, on average, to use at most half the available context. The results highlight human superiority in handling missing content words. Most importantly, the experiments illustrate the detailed information one can obtain about the performance of a language model by using missing-word tests.
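The two measures can be illustrated with a minimal sketch. The abstract does not give the exact scoring formula, so the code below assumes one plausible reading: each measure is the fraction of test items for which at least one of the top-k nominated completions matches the missing word (verbatim) or shares its grammatical class. The names `Item`, `verbatim_hit`, `class_hit`, `predictability`, the cutoff `k`, and the toy word-class table are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Item:
    """One missing-word test item (hypothetical structure)."""
    missing_word: str
    completions: List[str]  # ranked list of nominations, best first


def _top(item: Item, k: Optional[int]) -> List[str]:
    # Consider the whole ranked list when k is None, else only the top k.
    cands = item.completions if k is None else item.completions[:k]
    return [c.lower() for c in cands]


def verbatim_hit(item: Item, k: Optional[int] = None) -> bool:
    # True if the exact missing word was nominated.
    return item.missing_word.lower() in _top(item, k)


def class_hit(item: Item, word_class: Dict[str, str],
              k: Optional[int] = None) -> bool:
    # True if any nomination has the same grammatical class as the
    # missing word, according to the supplied (assumed) lookup table.
    target = word_class.get(item.missing_word.lower())
    if target is None:
        return False
    return any(word_class.get(c) == target for c in _top(item, k))


def predictability(items: List[Item], hit: Callable[[Item], bool]) -> float:
    # Fraction of items scored as a hit; 0.0 for an empty test set.
    if not items:
        return 0.0
    return sum(1 for it in items if hit(it)) / len(items)


# Toy data, invented purely for illustration.
WORD_CLASS = {
    "cat": "noun", "dog": "noun", "mouse": "noun",
    "ran": "verb", "walked": "verb", "jumped": "verb",
    "the": "det", "quickly": "adv", "blue": "adj",
}
ITEMS = [
    Item("cat", ["dog", "cat", "mouse"]),
    Item("ran", ["walked", "jumped"]),
    Item("the", ["quickly", "blue"]),
]

verbatim = predictability(ITEMS, verbatim_hit)
grammatical = predictability(ITEMS, lambda it: class_hit(it, WORD_CLASS))
print(f"verbatim={verbatim:.3f} grammatical_class={grammatical:.3f}")
```

The same functions can score human subjects and a language model alike, provided both produce ranked completion lists, which is what allows the two to be compared on the same test items.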
Subject
Speech and Hearing, Linguistics and Language, Sociology and Political Science, Language and Linguistics, General Medicine
Cited by
3 articles.