Crosslingual and Multilingual Construction of Syntax-Based Vector Space Models

Published: 2014-12
Volume: 2
Pages: 245-258
ISSN: 2307-387X
Container title: Transactions of the Association for Computational Linguistics
Language: en
Short container title: TACL

Author:

Utt Jason¹, Padó Sebastian¹

Affiliation:

1. Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart,

Abstract

Syntax-based distributional models of lexical semantics provide a flexible and linguistically adequate representation of co-occurrence information. However, their construction requires large, accurately parsed corpora, which are unavailable for most languages. In this paper, we develop a number of methods to overcome this obstacle. We describe (a) a crosslingual approach that constructs a syntax-based model for a new language requiring only an English resource and a translation lexicon; and (b) multilingual approaches that combine crosslingual with monolingual information, subject to availability. We evaluate on two lexical semantic benchmarks in German and Croatian. We find that the models exhibit complementary profiles: crosslingual models yield higher accuracies while monolingual models provide better coverage. In addition, we show that simple multilingual models can successfully combine their strengths.
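The abstract gives no implementation details, so the following is only a minimal sketch of the two ideas it names, not the authors' method. It assumes a toy English model that maps each word to weighted (relation, context word) features, a toy German-to-English translation lexicon, and a toy monolingual German model. It also assumes a simplified projection in which only target words are translated and a German word's vector is the sum of its translations' English vectors. Every name and number here (`english_model`, `lexicon`, `mono_model`, `crosslingual_vector`, `multilingual_similarity`) is hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Toy English syntax-based model (assumed data):
# target word -> {(dependency relation, context word): weight}
english_model = {
    "dog": {("subj_of", "bark"): 3.0, ("obj_of", "feed"): 2.0},
    "cat": {("subj_of", "meow"): 3.0, ("obj_of", "feed"): 2.0},
}

# Toy translation lexicon (assumed data): German word -> English translations
lexicon = {"Hund": ["dog"], "Katze": ["cat"], "Tier": ["dog", "cat"]}

# Toy monolingual German model (assumed data), with wider coverage
mono_model = {
    "Hund": {("subj_of", "bellen"): 2.0},
    "Welpe": {("subj_of", "bellen"): 1.0},
}


def crosslingual_vector(word, lexicon, source_model):
    """Sum the source-language vectors of all translations of `word`."""
    vec = defaultdict(float)
    for translation in lexicon.get(word, []):
        for feature, weight in source_model.get(translation, {}).items():
            vec[feature] += weight
    return dict(vec)


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def multilingual_similarity(w1, w2, lexicon, source_model, mono_model):
    """Use the crosslingual model when it covers both words;
    otherwise back off to the monolingual model."""
    u = crosslingual_vector(w1, lexicon, source_model)
    v = crosslingual_vector(w2, lexicon, source_model)
    if u and v:
        return cosine(u, v)
    return cosine(mono_model.get(w1, {}), mono_model.get(w2, {}))


# "Hund" and "Katze" are both covered by the lexicon: crosslingual model.
sim_covered = multilingual_similarity("Hund", "Katze", lexicon, english_model, mono_model)
# "Welpe" has no lexicon entry: the monolingual model is used instead.
sim_backoff = multilingual_similarity("Hund", "Welpe", lexicon, english_model, mono_model)
```

The backoff in `multilingual_similarity` is one simple way to pair the higher accuracy reported for crosslingual models with the better coverage reported for monolingual models. The paper's own combination schemes may differ.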

Publisher

MIT Press - Journals

Subject

Artificial Intelligence, Computer Science Applications, Linguistics and Language, Human-Computer Interaction, Communication

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/tacl_a_00180

Cited by 1 article.