Extraction of Literary Character Information in Portuguese

Published:2023-06-30 Issue:1 Volume:15 Page:31-40
ISSN:1647-0818
Container-title:Linguamática
Short-container-title:Linguamática

Author:

Bick Eckhard

Abstract

This chapter describes PALAVRAS-DIP, a system for the automatic identification of characters and their social profiles in Portuguese and Brazilian literature. The system has been designed as an add-on module for a morphosyntactic and semantic parser. We tag human named entities (NE) for profession and social position, and use Constraint Grammar (CG) relational tags to keep track of co-reference (e.g. pronoun anaphora, zero-subject verbs) and family relations between the characters. The resulting base annotation allows the extraction of character networks. The extraction program recognizes and bundles character name variants and distinguishes between names with a narrative function and simple cultural references. System development was motivated by DIP, a shared-task evaluation on 100 historical novels, where a prototype version achieved reasonable F-scores for character identification (63.4%) and alias resolution (68.1%), but underperformed for family relations (15.5%).

Publisher

University of Minho

Subject

Linguistics and Language,Language and Linguistics

Cited by 4 articles.

1. Avaliação no Desafio de Identificação de Personagens;Linguamática;2023-07-04

2. Pais, filhos e outras relações familiares no DIP;Linguamática;2023-07-04

3. DIP - Desafio de Identificação de Personagens: objectivo, organização, recursos e resultados;Linguamática;2023-07-04

4. Desafios e vantagens do processo de identificação automática do gênero e das profissões das personagens no DIP;Linguamática;2023-07-02