Affiliation:
1. Université de Strasbourg
2. Université catholique de Louvain
Abstract
Knowledge of multiword expressions (MWEs) is key to learning a foreign language, but teaching them remains hindered by the lack
of lists of expressions tied to pedagogical aims. In this paper, we present an extended version of the PolylexFLE database,
containing 4,525 French MWEs of three types: idioms, collocations, and fixed expressions. In order to propose
exercises that follow the difficulty scale of the Common European Framework of Reference for Languages (CEFR), we used a mixed approach
(manual and automatic) to annotate 1,186 expressions with CEFR levels. The paper focuses mostly on the automatic
procedure, which first uses a pattern-based system to identify the expressions from the PolylexFLE database (and their variants)
in a corpus of pedagogical texts labelled with CEFR levels. In a second step, the distribution of each expression in this corpus is
estimated and transformed into a single CEFR level. Finally, the proposed automatic approach is evaluated by 52 learners of French as a foreign language.
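The two-step automatic procedure can be illustrated with a minimal Python sketch. The abstract does not give the actual patterns or the rule that turns a distribution into a level, so everything here is an assumption made for illustration: the regular expression, the toy corpus, the function names `level_distribution` and `first_occurrence_level`, and the "first level of occurrence" rule.

```python
import re
from collections import Counter

CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

def level_distribution(pattern, corpus):
    """Share of texts at each CEFR level that contain the expression.

    `corpus` is a list of (level, text) pairs; `pattern` is a regular
    expression covering the expression and its variants (hypothetical).
    """
    regex = re.compile(pattern, flags=re.IGNORECASE)
    hits, totals = Counter(), Counter()
    for level, text in corpus:
        totals[level] += 1
        if regex.search(text):
            hits[level] += 1
    return {
        lvl: (hits[lvl] / totals[lvl] if totals[lvl] else 0.0)
        for lvl in CEFR_LEVELS
    }

def first_occurrence_level(distribution, threshold=0.0):
    """Assumed rule: lowest level whose share exceeds `threshold`."""
    for lvl in CEFR_LEVELS:
        if distribution[lvl] > threshold:
            return lvl
    return None  # expression not attested in the corpus

# Toy corpus of CEFR-labelled texts (invented examples).
corpus = [
    ("A1", "Je m'appelle Marie et j'habite à Paris."),
    ("A2", "Il doit prendre une décision avant lundi."),
    ("B1", "Elle a pris la décision de partir."),
    ("B1", "Nous parlons du climat."),
]

# Hypothetical pattern for "prendre une décision" and a few variants.
pattern = r"\b(?:prendre|prend|prends|pris)\s+(?:une|la)\s+décision\b"

dist = level_distribution(pattern, corpus)
level = first_occurrence_level(dist)
print(dist)
print(level)
```

Raising `threshold` makes the assumed rule more conservative, since a single stray occurrence in a low-level text no longer fixes the level of the expression.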
Publisher
John Benjamins Publishing Company