A tale of four parsers: methodological reflections on diagnostic evaluation and in-depth error analysis for meaning representation parsing

Published:2022-05-17 Issue:4 Volume:56 Page:1075-1102
ISSN:1574-020X
Container-title:Language Resources and Evaluation
language:en
Short-container-title:Lang Resources & Evaluation

Author:

Buljan Maja, Nivre Joakim, Oepen Stephan, Øvrelid Lilja

Abstract

We discuss methodological choices in diagnostic evaluation and error analysis in meaning representation parsing (MRP), i.e. mapping from natural language utterances to graph-based encodings of semantic structure. We expand on a pilot quantitative study in contrastive diagnostic evaluation, inspired by earlier work in syntactic dependency parsing, and propose a novel methodology for qualitative error analysis. This two-pronged study is performed using a selection of submissions, data, and evaluation tools featured in the 2019 shared task on MRP. Our aim is to devise methods for identifying strengths and weaknesses in different broad families of parsing techniques, as well as investigating the relations between specific parsing approaches, different meaning representation frameworks, and individual linguistic phenomena, by identifying and comparing common error patterns. Our preliminary empirical results suggest that the proposed methodologies can be meaningfully applied to parsing into graph-structured target representations, as a side-effect uncovering hitherto unknown properties of the different systems that can inform future development and cross-fertilization across approaches.

Funder

University of Oslo

Publisher

Springer Science and Business Media LLC

Subject

Library and Information Sciences, Linguistics and Language, Education, Language and Linguistics

Link

https://link.springer.com/content/pdf/10.1007/s10579-022-09591-7.pdf

References: 35 articles.

1. Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., Knight, K., Koehn, P., Palmer, M., & Schneider, N. (2013). Abstract Meaning Representation for sembanking. In Proceedings of the 7th linguistic annotation workshop and interoperability with discourse, Sofia (pp. 178–186). http://www.aclweb.org/anthology/W13-2322

2. Bender, E. M., Flickinger, D., Oepen, S., Packard, W., & Copestake, A. (2015). Layers of interpretation: On grammar and compositionality. In Proceedings of the 11th international conference on computational semantics (pp. 239–249).

3. Buchholz, S., & Marsi, E. (2006). CoNLL-X shared task on multilingual dependency parsing. In Proceedings of the 10th conference on natural language learning, New York, NY (pp. 149–164). http://www.aclweb.org/anthology/W/W06/W06-2920

4. Buljan, M., Nivre, J., Oepen, S., & Øvrelid, L. (2020). A tale of three parsers: Towards diagnostic evaluation for meaning representation parsing. In Proceedings of the 12th language resources and evaluation conference (pp. 1902–1909).

5. Cai, S., & Knight, K. (2013). Smatch: An evaluation metric for semantic feature structures. In Proceedings of the 51st meeting of the Association for Computational Linguistics, Sofia (pp. 748–752). http://www.aclweb.org/anthology/P13-2131