Affiliation:
1. Leiden University, the Netherlands
2. University of Amsterdam, the Netherlands
Abstract
In this paper we present three design principles of language (experience, heterogeneity and redundancy) and discuss recent developments in a family of models that incorporate them, namely Data-Oriented Parsing (DOP) and Unsupervised Data-Oriented Parsing (U-DOP). Although the idea of some form of redundant storage has become part and parcel of parsing technologies and usage-based linguistic approaches alike, the question of how much redundancy is cognitively realistic and/or computationally efficient remains open. We argue that a segmentation-based approach (Bayesian Model Merging) combined with an all-subtrees approach reduces the number of rules needed to achieve optimal performance, thus making the parser more efficient. At the same time, starting from unsegmented wholes comes closer to the acquisitional situation of a language learner, and thus adds to the cognitive plausibility of the model.
Subject
Speech and Hearing, Linguistics and Language, Sociology and Political Science, Language and Linguistics, General Medicine
Cited by
7 articles.