Using the Europarl corpus for cross-linguistic research

Published: 2013-11-15 Issue: Volume: 27 Pages: 23-42
ISSN:0774-5141
Container-title:Interference and normalization in genre-controlled multilingual corpora
language:en
Short-container-title:BJL

Author:

Cartoni Bruno, Zufferey Sandrine¹, Meyer Thomas

Affiliation:

1. Utrecht Institute of Linguistics OTS, Universiteit Utrecht

Abstract

Europarl is a large multilingual corpus containing the minutes of the debates at the European Parliament. This article presents a method to extract different corpora from Europarl: monolingual and multilingual comparable corpora, as well as parallel corpora. Using state-of-the-art measures of homogeneity, we show that these corpora are very similar. In addition, we argue that they present many advantages for research in various fields of linguistics and translation studies, and we also discuss some of their limitations. We conclude by reviewing a number of previous studies that made use of these corpora, emphasizing in each case the possibilities offered by Europarl.

Publisher

John Benjamins Publishing Company

Subject

Linguistics and Language, Language and Linguistics

Link

http://www.jbe-platform.com/deliver/fulltext/bjl.27.02car.pdf

Cited by 21 articles.

1. Decoding French equivalents of the English present perfect: evidence from parallel corpora of parliamentary documents;Linguistics Vanguard;2024-07-02

2. Indirect translation and its influence on term variation;Indirect Translation and Sustainable Development;2023-07-27

3. Corpus Pragmatics;2023-02-24

4. Source language classification of indirect translations;Target. International Journal of Translation Studies;2022-04-11

5. The Politics of Person Reference;Pragmatics & Beyond New Series;2021-09-16