Dependency parsing of learner English

Published: 2018-05-31 Issue: 1 Volume: 23 Page: 28-54
ISSN: 1384-6655
Container-title: International Journal of Corpus Linguistics
Language: en
Short-container-title: IJCL

Author:

Huang Yan¹, Murakami Akira², Alexopoulou Theodora¹, Korhonen Anna¹

Affiliation:

1. University of Cambridge

2. University of Tübingen

Abstract

Current syntactic annotation of large-scale learner corpora mainly relies on “standard parsers” trained on native language data. Understanding how these parsers perform on learner data is important for downstream research and applications related to learner language. This study evaluates the performance of multiple standard probabilistic parsers on learner English. Our contributions are three-fold. Firstly, we demonstrate that the common practice of constructing a gold standard, by manually correcting the pre-annotation of a single parser, can introduce bias into parser evaluation. We propose an alternative annotation method that controls for this annotation bias. Secondly, we quantify the influence of learner errors on parsing errors, and identify the learner errors that affect parsing most. Finally, we compare the performance of the parsers on learner English and native English. Our results have useful implications for selecting a standard parser for learner English.

Publisher

John Benjamins Publishing Company

Subject

Linguistics and Language, Language and Linguistics

Link

http://www.jbe-platform.com/deliver/fulltext/ijcl.16080.hua.pdf

References: 31 articles.

1. Anchoring and Agreement in Syntactic Annotations

2. Universal Dependencies for Learner English

3. CoNLL-X shared task on multilingual dependency parsing

4. Parsing to Stanford dependencies: Trade-offs between speed and accuracy; Cer, 2010

5. Coarse-to-fine n-best parsing and MaxEnt discriminative reranking; Charniak, 2005

Cited by 35 articles.

1. Coding all clauses in L2 data: A call for consistency; Research Methods in Applied Linguistics; 2024-12

2. Analysis of verb argument constructions (VACs) in L2 learners across proficiency levels: A corpus-based study in L1 Indonesian; Applied Corpus Linguistics; 2024-12

3. Evaluating NLP models with written and spoken L2 samples; Research Methods in Applied Linguistics; 2024-08

4. The potential influence of cross-linguistic lexical similarity on lexical diversity in L2 English writing; Corpora; 2024-08

5. Utility of Kolmogorov complexity measures: Analysis of L2 groups and L1 backgrounds; PLOS ONE; 2024-04-18