A tale of four parsers: methodological reflections on diagnostic evaluation and in-depth error analysis for meaning representation parsing

Author:

Buljan Maja, Nivre Joakim, Oepen Stephan, Øvrelid Lilja

Abstract

We discuss methodological choices in diagnostic evaluation and error analysis in meaning representation parsing (MRP), i.e. mapping from natural language utterances to graph-based encodings of semantic structure. We expand on a pilot quantitative study in contrastive diagnostic evaluation, inspired by earlier work in syntactic dependency parsing, and propose a novel methodology for qualitative error analysis. This two-pronged study is performed using a selection of submissions, data, and evaluation tools featured in the 2019 shared task on MRP. Our aim is to devise methods for identifying strengths and weaknesses in different broad families of parsing techniques, as well as investigating the relations between specific parsing approaches, different meaning representation frameworks, and individual linguistic phenomena, by identifying and comparing common error patterns. Our preliminary empirical results suggest that the proposed methodologies can be meaningfully applied to parsing into graph-structured target representations, as a side-effect uncovering hitherto unknown properties of the different systems that can inform future development and cross-fertilization across approaches.

Funder

University of Oslo

Publisher

Springer Science and Business Media LLC

Subject

Library and Information Sciences, Linguistics and Language, Education, Language and Linguistics

References (35 articles)

1. Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., Knight, K., Koehn, P., Palmer, M., & Schneider, N. (2013). Abstract Meaning Representation for sembanking. In Proceedings of the 7th linguistic annotation workshop and interoperability with discourse, Sofia (pp. 178–186). http://www.aclweb.org/anthology/W13-2322

2. Bender, E. M., Flickinger, D., Oepen, S., Packard, W., & Copestake, A. (2015). Layers of interpretation: On grammar and compositionality. In Proceedings of the 11th international conference on computational semantics (pp. 239–249).

3. Buchholz, S., & Marsi, E. (2006). CoNLL-X shared task on multilingual dependency parsing. In Proceedings of the 10th conference on natural language learning, New York, NY (pp. 149–164). http://www.aclweb.org/anthology/W/W06/W06-2920

4. Buljan, M., Nivre, J., Oepen, S., & Øvrelid, L. (2020). A tale of three parsers: Towards diagnostic evaluation for meaning representation parsing. In Proceedings of the 12th language resources and evaluation conference (pp. 1902–1909).

5. Cai, S., & Knight, K. (2013). Smatch: An evaluation metric for semantic feature structures. In Proceedings of the 51st meeting of the Association for Computational Linguistics, Sofia (pp. 748–752). http://www.aclweb.org/anthology/P13-2131
