Abstract
This paper surveys the strategies that the Contrastive, Typological, and Translation Mining parallel corpus traditions rely on to deal with the issue of target language representativeness of translations. On the basis of a comparison of the corpus architectures and research designs of the three traditions, we argue that they have each developed their own representativeness strategies: (i) monolingual control corpora (Contrastive tradition), (ii) limits on the scope of research questions (Typological tradition), and (iii) parallel control corpora (Translation Mining tradition). We introduce normalized pointwise mutual information (NPMI) as a bi-directional measure of cross-linguistic association, allowing for an easy comparison of the outcomes of different traditions and the impact of the monolingual and parallel control corpus representativeness strategies. We further argue that corpus size has a major impact on the reliability of the monolingual control corpus strategy and that a sequential parallel control corpus strategy is preferable for smaller corpora.
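As an illustration of the measure named above, the following is a minimal sketch of NPMI in its standard formulation, npmi(x, y) = pmi(x, y) / -log p(x, y), computed from co-occurrence counts. It is not taken from the paper: the function name, the parameter names, and the assumption that counts are collected over aligned segment pairs (a source-language form x and a target-language form y) are hypothetical. The measure is symmetric in x and y, which is what makes it usable in both translation directions.

```python
import math


def npmi(joint_count, count_x, count_y, total):
    """Normalized pointwise mutual information from raw counts.

    Hypothetical setup: `total` aligned segment pairs, of which `count_x`
    contain source form x, `count_y` contain target form y, and
    `joint_count` contain both. Returns a value in [-1, 1]:
    1 for perfect co-occurrence, 0 for independence, -1 for no co-occurrence.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if joint_count == 0:
        # Limit value when x and y never co-occur.
        return -1.0
    p_xy = joint_count / total
    p_x = count_x / total
    p_y = count_y / total
    if p_xy == 1.0:
        # Both forms occur in every pair; the normalizer -log p(x, y) is 0.
        return 1.0
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)


if __name__ == "__main__":
    # Illustrative counts only, not data from the paper.
    print(npmi(joint_count=40, count_x=50, count_y=60, total=1000))
```

Because the value is bounded between -1 and 1 regardless of corpus size, scores obtained from differently sized corpora can be placed on the same scale, though small counts still make individual estimates unstable.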
Subject
Linguistics and Language, Language and Linguistics
Cited by
26 articles.