Affiliation:
1. Université Paris-Saclay, CNRS, LISN, 91400 Orsay, France. aina.gari@limsi.fr
2. Department of Digital Humanities, University of Helsinki, Helsinki, Finland. marianna.apidianaki@helsinki.fi
Abstract
Pre-trained language models (LMs) encode rich information about linguistic structure, but their knowledge about lexical polysemy remains unclear. We propose a novel experimental setup for analyzing this knowledge in LMs specifically trained for different languages (English, French, Spanish, and Greek) and in multilingual BERT. We perform our analysis on datasets carefully designed to reflect different sense distributions, and control for parameters that are highly correlated with polysemy, such as frequency and grammatical category. We demonstrate that BERT-derived representations reflect words’ polysemy level and their partitionability into senses. Polysemy-related information is more clearly present in English BERT embeddings, but models in other languages also manage to establish relevant distinctions between words at different polysemy levels. Our results contribute to a better understanding of the knowledge encoded in contextualized representations and open up new avenues for multilingual lexical semantics research.
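To make the abstract's notion of "partitionability into senses" concrete, the sketch below illustrates one simple way such a signal could be probed: collect a word's contextualized BERT vectors across sentences, cluster them, and use the silhouette score as a rough separability measure. This is a minimal illustration, not the paper's exact method; the model name, example sentences, cluster count, and the silhouette-based measure are all assumptions introduced here for exposition.

```python
# Minimal sketch (illustrative only): cluster a word's contextualized BERT
# vectors and use the silhouette score as a rough "partitionability" signal.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

MODEL_NAME = "bert-base-uncased"  # assumption: any BERT-style model could be probed this way
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def word_vectors(word, sentences):
    """Average the subword vectors of `word` in each sentence (last layer)."""
    word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    vecs = []
    for sent in sentences:
        enc = tokenizer(sent, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]  # (seq_len, dim)
        ids = enc["input_ids"][0].tolist()
        # Locate the first occurrence of the word's subword span.
        for i in range(len(ids) - len(word_ids) + 1):
            if ids[i:i + len(word_ids)] == word_ids:
                vecs.append(hidden[i:i + len(word_ids)].mean(dim=0).numpy())
                break
    return vecs

# Hypothetical usage: occurrences of the polysemous word "bank".
sentences = [
    "She sat on the bank of the river.",
    "He deposited the cheque at the bank.",
    "The bank approved the loan yesterday.",
    "Wild flowers grew along the river bank.",
]
X = word_vectors("bank", sentences)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
# Higher silhouette ~ occurrences fall into better-separated groups (senses).
print("silhouette:", silhouette_score(X, labels))
```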
Cited by: 13 articles.