Improving mention detection for Basque based on a deep error analysis

Published:2016-07-12 Issue:3 Volume:23 Page:351-384
ISSN:1351-3249
Container-title:Natural Language Engineering
language:en
Short-container-title:Nat. Lang. Eng.

Author:

Ander Soraluze, Olatz Arregi, Xabier Arregi, Arantza Díaz de Ilarraza

Abstract

This paper presents the process of improving a mention detector for Basque. The system is rule-based and takes into account the characteristics of mentions in Basque. A classification of error types is proposed, based on the errors that occur during mention detection. A deep error analysis distinguishing error types and causes is presented, and improvements are proposed. At the final stage, the system obtains an F-measure of 74.57% under the Exact Matching protocol and of 80.57% under Lenient Matching. We also show the performance of the mention detector with gold standard data as input, in order to exclude errors caused by the previous stages of linguistic processing. In this scenario, we obtain an F-measure of 85.89% with Strict Matching and of 89.06% with Lenient Matching, i.e., a difference of 11.32 and 8.49 percentage points, respectively. Finally, we analyse how improvements in mention detection affect coreference resolution.
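The two matching protocols and the reported gains can be illustrated with a minimal sketch. The abstract does not define the protocols, so the criteria below are assumptions: strict (exact) matching counts a predicted mention as correct only when its boundaries are identical to a gold mention, and lenient matching counts it as correct when it shares at least one token with a gold mention. The paper's own lenient criterion may differ. The names `Span`, `overlaps` and `f_measure`, and the toy spans, are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive (assumed representation)


def overlaps(a: Span, b: Span) -> bool:
    """True when the two spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def f_measure(gold: List[Span], predicted: List[Span], lenient: bool = False) -> float:
    """F-measure of predicted mentions against gold mentions.

    Strict: a mention matches only on identical boundaries.
    Lenient (assumed definition): a mention matches on any token overlap.
    """
    if lenient:
        matched_pred = sum(any(overlaps(p, g) for g in gold) for p in predicted)
        matched_gold = sum(any(overlaps(g, p) for p in predicted) for g in gold)
    else:
        gold_set, pred_set = set(gold), set(predicted)
        matched_pred = sum(p in gold_set for p in predicted)
        matched_gold = sum(g in pred_set for g in gold)
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
gold_mentions = [(0, 2), (3, 5), (6, 9)]
predicted_mentions = [(0, 2), (3, 4), (10, 12)]

strict_f = f_measure(gold_mentions, predicted_mentions)
lenient_f = f_measure(gold_mentions, predicted_mentions, lenient=True)

# Gains reported in the abstract when gold-standard input replaces automatic input,
# in percentage points.
strict_gain = round(85.89 - 74.57, 2)
lenient_gain = round(89.06 - 80.57, 2)
```

On the toy data, the partially overlapping prediction `(3, 4)` is rejected under strict matching but accepted under lenient matching, which is why lenient scores are never lower than strict ones. The last two lines recompute the differences stated in the abstract from its four F-measures.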

Publisher

Cambridge University Press (CUP)

Subject

Artificial Intelligence, Linguistics and Language, Language and Linguistics, Software

References: 51 articles.

1. Methodology and steps towards the construction of EPEC, a corpus of written Basque tagged at morphological and syntactic levels for the automatic processing;Aduriz;Language and Computers,2006

Cited by 3 articles.

1. Deep Learning Based Algorithm for Detecting Errors in Mandarin Read-Aloud Backreading Omission Incremental Reading;Applied Mathematics and Nonlinear Sciences;2024-01-01

2. EusTimeML: A mark-up language for temporal information in Basque;Research in Corpus Linguistics;2020

3. EUSKOR: End-to-end coreference resolution system for Basque;PLOS ONE;2019-09-12