Author:
de Novais Eder Miranda,Paraboni Ivandré
Abstract
Abstract
As in many other natural language processing (NLP) fields, the use of statistical methods is now part of mainstream natural language generation (NLG). In the development of systems of this kind, however, there is the issue of data sparseness, a problem that is particularly evident in the case of morphologically-rich languages such as Portuguese. This work presents a shallow surface realisation system that makes use of factored language models (FLMs) of Portuguese to overcome some of these difficulties. The system combines FLMs trained on a large corpus with a number of NLP resources that have been made publicly available by the Brazilian NLP research community in recent years, such as corpora, dictionaries, thesauri and others. Our FLM-based approach to surface realisation has been successfully applied to the generation of Brazilian newspapers headlines, and the results are shown to outperform a number of statistical and non-statistical baseline systems alike.
Publisher
Springer Science and Business Media LLC
Cited by
10 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献