HILDA: A Discourse Parser Using Support Vector Machine Classification

Published:2010-12-10 Issue:3 Volume:1 Page:1-33
ISSN:2152-9620
Container-title:Dialogue & Discourse
Short-container-title:dad

Authors

Hugo Hernault, Helmut Prendinger, David A. du Verle, Mitsuru Ishizuka

Abstract

Discourse structures have a central role in several computational tasks, such as question answering or dialogue generation. In particular, the framework of the Rhetorical Structure Theory (RST) offers a sound formalism for hierarchical text organization. In this article, we present HILDA, an implemented discourse parser based on RST and Support Vector Machine (SVM) classification. SVM classifiers are trained and applied to discourse segmentation and relation labeling. By combining labeling with a greedy bottom-up tree-building approach, we are able to create accurate discourse trees in linear time. Importantly, our parser can parse entire texts, whereas the publicly available parser SPADE (Soricut and Marcu 2003) is limited to sentence-level analysis. HILDA outperforms other discourse parsers for tree structure construction and discourse relation labeling. For the discourse parsing task, our system reaches 78.3% of the performance level of human annotators. Compared to a state-of-the-art rule-based discourse parser, our system achieves a performance increase of 11.6%.
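
The greedy bottom-up tree building mentioned in the abstract can be illustrated with a minimal sketch. This is not HILDA's code, and it shows only one plausible reading of the approach: repeatedly merge the adjacent pair of spans that a scorer rates highest. The names `Node`, `build_tree`, `score`, and `label` are hypothetical, and the two callables merely stand in for trained SVM classifiers. Because it scans for the best pair on every step, this sketch is quadratic in the number of elementary discourse units (EDUs) and does not reproduce the linear-time behavior reported in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    """Discourse tree node: a leaf holds one EDU, an internal node a relation."""
    text: str
    relation: Optional[str] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(
    edus: List[str],
    score: Callable[[Node, Node], float],  # assumption: stand-in for a structure classifier
    label: Callable[[Node, Node], str],    # assumption: stand-in for a relation classifier
) -> Node:
    """Greedily merge the highest-scoring adjacent pair until one tree remains."""
    if not edus:
        raise ValueError("need at least one EDU")
    nodes = [Node(text=e) for e in edus]
    # scores[i] is the merge score for the adjacent pair (nodes[i], nodes[i + 1]).
    scores = [score(a, b) for a, b in zip(nodes, nodes[1:])]
    while len(nodes) > 1:
        # Pick the best adjacent pair (first one wins on ties).
        i = max(range(len(scores)), key=scores.__getitem__)
        left, right = nodes[i], nodes[i + 1]
        merged = Node(
            text=left.text + " " + right.text,
            relation=label(left, right),
            left=left,
            right=right,
        )
        nodes[i:i + 2] = [merged]
        del scores[i]
        # Only the pairs that touch the new node need to be rescored.
        if i > 0:
            scores[i - 1] = score(nodes[i - 1], merged)
        if i < len(scores):
            scores[i] = score(merged, nodes[i + 1])
    return nodes[0]
```

With real classifiers, `score` would return the confidence that two adjacent spans form a subtree and `label` would return the predicted relation for the merged span; here both are left to the caller.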

Publisher

University of Illinois Libraries

Subject

Linguistics and Language, Communication, Language and Linguistics

Cited by 82 articles.

1. Chinese-DiMLex: a lexicon of Chinese discourse connectives;Language Resources and Evaluation;2024-08-18

2. New Text Classification Strategy Based on a Word Embedding and Noise-Words Removal;2023 24th International Arab Conference on Information Technology (ACIT);2023-12-06

3. Topic-Aware Two-Layer Context-Enhanced Model for Chinese Discourse Parsing;Communications in Computer and Information Science;2023-11-27

4. Top-down Text-Level Discourse Rhetorical Structure Parsing with Bidirectional Representation Learning;Journal of Computer Science and Technology;2023-09

5. Research Materials and Methods;Dependency Structures from Syntax to Discourse;2023-08-09