Affiliation:
1. National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD 20894, USA
2. School of Software Convergence, Myongji University, Seoul 03674, South Korea
Abstract
Manually annotated data is key to developing text-mining and information-extraction algorithms. However, human annotation requires considerable time, effort and expertise. Given the rapid growth of biomedical literature, it is paramount to build tools that speed up annotation while maintaining expert quality. While existing text annotation tools may provide user-friendly interfaces to domain experts, they offer limited support for figure display, project management, and multi-user team annotation. In response, we developed TeamTat (https://www.teamtat.org), a web-based annotation tool (local setup available) equipped to manage team annotation projects engagingly and efficiently. TeamTat is a novel tool for managing multi-user, multi-label document annotation, reflecting the entire production life cycle. Project managers can specify the annotation schema for entities and relations, select annotators, and distribute documents anonymously to prevent bias. Documents can be input as plain text, PDF or BioC (uploaded locally or retrieved automatically from PubMed/PMC), and the output format is BioC with inline annotations. TeamTat displays figures from the full text for the annotator's convenience. Multiple users can work on the same document independently in their own workspaces, and the team manager can track task completion. TeamTat assesses corpus quality via inter-annotator agreement statistics and provides a user-friendly interface for reviewing annotations and resolving inter-annotator disagreements.
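Since the abstract states that TeamTat exports annotated documents as BioC with inline annotations, a minimal sketch of consuming such an export may be helpful. The sketch below assumes only the standard BioC XML schema (collection > document > passage > annotation); the file name "teamtat_export.xml" and the "type" infon key are illustrative assumptions, not part of the paper.

```python
# Minimal sketch: reading entity annotations from a BioC XML export.
# Assumes the standard BioC schema; the file name and the "type" infon key
# are hypothetical and may differ in an actual TeamTat project export.
import xml.etree.ElementTree as ET

def read_bioc_annotations(path):
    """Yield (document id, entity type, offset, text) for every annotation."""
    root = ET.parse(path).getroot()              # <collection> element
    for doc in root.iter("document"):
        doc_id = doc.findtext("id", default="")
        for passage in doc.iter("passage"):
            for ann in passage.iter("annotation"):
                # The entity label is conventionally stored in <infon key="type">.
                ent_type = next(
                    (i.text for i in ann.iter("infon") if i.get("key") == "type"),
                    "unknown",
                )
                loc = ann.find("location")
                offset = int(loc.get("offset")) if loc is not None else -1
                yield doc_id, ent_type, offset, ann.findtext("text", default="")

if __name__ == "__main__":
    for doc_id, ent_type, offset, text in read_bioc_annotations("teamtat_export.xml"):
        print(f"{doc_id}\t{ent_type}\t{offset}\t{text}")
```

Given two such exports of the same document set from different annotators, span-level overlap of the yielded tuples could serve as a simple basis for the kind of inter-annotator agreement statistics the abstract describes.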
Funder
National Institutes of Health
Ministry of Science and ICT
Ministry of Education
Publisher
Oxford University Press (OUP)
Cited by
40 articles.