Text: now in 2D! A framework for lexical expansion with contextual similarity

Published:2013-07-22 Issue:1 Volume:1
ISSN:2299-8470
Container-title:Journal of Language Modelling
Short-container-title:JLM

Author:

Biemann Chris,Riedl Martin

Abstract

A new metaphor of two-dimensional text for data-driven semantic modeling of natural language is proposed, which provides an entirely new angle on the representation of text: not only syntagmatic relations are annotated in the text, but also paradigmatic relations are made explicit by generating lexical expansions. We operationalize distributional similarity in a general framework for large corpora, and describe a new method to generate similar terms in context. Our evaluation shows that distributional similarity is able to produce high-quality lexical resources in an unsupervised and knowledge-free way, and that our highly scalable similarity measure yields better scores in a WordNet-based evaluation than previous measures for very large corpora. Evaluating on a lexical substitution task, we find that our contextualization method improves over a non-contextualized baseline across all parts of speech, and we show how the metaphor can be applied successfully to part-of-speech tagging. A number of ways to extend and improve the contextualization method within our framework are discussed. As opposed to comparable approaches, our framework defines a model of lexical expansions in context that can generate the expansions as opposed to ranking a given list, and thus does not require existing lexical-semantic resources.

Publisher

Institute of Computer Science, Polish Academy of Sciences

Subject

Computer Science Applications,Linguistics and Language,Modelling and Simulation

Cited by 32 articles.

1. Unsupervised Ultra-Fine Entity Typing with Distributionally Induced Word Senses;Lecture Notes in Computer Science;2024

2. Text augmentation for semantic frame induction and parsing;Language Resources and Evaluation;2023-10-21

3. Using distributional thesaurus to enhance transformer-based contextualized representations for low resource languages;Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing;2022-04-25

4. Hypernymy Detection for Low-resource Languages: A Study for Hindi, Bengali, and Amharic;ACM Transactions on Asian and Low-Resource Language Information Processing;2022-03-04

5. Network embeddings from distributional thesauri for improving static word representations;Expert Systems with Applications;2022-01