Unlexicalized Transition-based Discontinuous Constituency Parsing

Published:2019-04-01 Issue: Volume:7 Page:73-89
ISSN:2307-387X
Container-title:Transactions of the Association for Computational Linguistics
language:en

Author:

Coavoux Maximin¹, Crabbé Benoît²,³, Cohen Shay B.⁴

Affiliation:

1. University of Edinburgh, ILCC, United Kingdom. mcoavoux@inf.ed.ac.uk

2. Université Paris Diderot, Université Sorbonne Paris Cité, LLF, France

3. Institut Universitaire de France (IUF), France. bcrabbe@linguist.univ-paris-diderot.fr

4. University of Edinburgh, ILCC, United Kingdom. scohen@inf.ed.ac.uk

Abstract

Lexicalized parsing models are based on the assumptions that (i) constituents are organized around a lexical head and (ii) bilexical statistics are crucial to solve ambiguities. In this paper, we introduce an unlexicalized transition-based parser for discontinuous constituency structures, based on a structure-label transition system and a bi-LSTM scoring system. We compare it with lexicalized parsing models in order to address the question of lexicalization in the context of discontinuous constituency parsing. Our experiments show that unlexicalized models systematically achieve higher results than lexicalized models, and provide additional empirical evidence that lexicalization is not necessary to achieve strong parsing results. Our best unlexicalized model sets a new state of the art on English and German discontinuous constituency treebanks. We further provide a per-phenomenon analysis of its errors on discontinuous constituents.

Publisher

MIT Press - Journals

Subject

Artificial Intelligence, Computer Science Applications, Linguistics and Language, Human-Computer Interaction, Communication

Link

http://direct.mit.edu/tacl/article-pdf/doi/10.1162/tacl_a_00255/1923183/tacl_a_00255.pdf

Reference: 61 articles.

1. Daniel M. Bikel. 2004. A distributional analysis of a lexicalized statistical parsing model. In Proceedings of EMNLP 2004, pages 182–189. Association for Computational Linguistics.

2. Anders Björkelund, Ozlem Cetinoglu, Richárd Farkas, Thomas Mueller, and Wolfgang Seeker. 2013. (Re)ranking meets morphosyntax: State-of-the-art results from the SPMRL 2013 shared task. In Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages, pages 135–145, Seattle, Washington, USA. Association for Computational Linguistics.

3. Léon Bottou. 2010. Large-scale machine learning with stochastic gradient descent. In Proceedings of the 19th International Conference on Computational Statistics (COMPSTAT’2010), pages 177–187, Paris, France. Springer.

4. Sabine Brants, Stefanie Dipper, Silvia Hansen, Wolfgang Lezius, and George Smith. 2002. Tiger treebank. In Proceedings of the Workshop on Treebanks and Linguistic Theories, September 20-21 (TLT02). Sozopol, Bulgaria.

5. Eugene Charniak. 1997. Statistical parsing with a context-free grammar and word statistics. In Proceedings of the Fourteenth National Conference on Artificial Intelligence and Ninth Conference on Innovative Applications of Artificial Intelligence, AAAI’97/IAAI’97, pages 598–603. AAAI Press.

Cited by 4 articles.

1. Discontinuous grammar as a foreign language;Neurocomputing;2023-03

2. Discontinuous Combinatory Constituency Parsing;Transactions of the Association for Computational Linguistics;2023

3. Multitask Pointer Network for multi-representational parsing;Knowledge-Based Systems;2022-01

4. A survey of syntactic-semantic parsing based on constituent and dependency structures;Science China Technological Sciences;2020-09-16