Abstract
As with many tasks in natural language processing, automatic term extraction (ATE) is increasingly approached as a machine learning problem. So far, most machine learning approaches to ATE broadly follow the traditional hybrid methodology: first extracting a list of unique candidate terms, then classifying these candidates based on the predicted probability that they are valid terms. However, with the rise of neural networks and word embeddings, the next development in ATE might be towards sequential approaches, i.e., classifying each occurrence of each token within its original context. To test the validity of such approaches for ATE, two sequential methodologies were developed, evaluated, and compared: one feature-based conditional random fields classifier and one embedding-based recurrent neural network. An additional comparison was made with a machine learning interpretation of the traditional approach. All systems were trained and evaluated on identical data in multiple languages and domains to identify their respective strengths and weaknesses. The sequential methodologies proved to be valid approaches to ATE, and the neural network even outperformed the more traditional approach. Interestingly, a combination of multiple approaches can outperform all of them separately, showing new ways to push the state of the art in ATE.
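The sequential reformulation described above treats ATE as token-level sequence labeling rather than candidate-list classification. A minimal sketch of the data preparation this implies, assuming a standard BIO tagging scheme and a hypothetical `bio_labels` helper (neither is taken from the paper itself), might look like:

```python
# Hypothetical sketch: recasting ATE as sequence labeling.
# Instead of classifying a deduplicated candidate list, every token
# occurrence in the running text gets a B (term begin), I (inside),
# or O (outside) label, here derived from a gold-standard term list.

def bio_labels(tokens, terms):
    """Label each token occurrence against known multi-word terms."""
    labels = ["O"] * len(tokens)
    term_seqs = [t.split() for t in terms]
    i = 0
    while i < len(tokens):
        matched = False
        # Prefer the longest matching term at this position.
        for seq in sorted(term_seqs, key=len, reverse=True):
            if tokens[i:i + len(seq)] == seq:
                labels[i] = "B"
                for j in range(1, len(seq)):
                    labels[i + j] = "I"
                i += len(seq)
                matched = True
                break
        if not matched:
            i += 1
    return labels

tokens = "the conditional random fields classifier works".split()
terms = ["conditional random fields"]
print(bio_labels(tokens, terms))  # → ['O', 'B', 'I', 'I', 'O', 'O']
```

Token/label pairs in this shape are what a CRF or recurrent tagger would then be trained on; the actual feature sets and embeddings used in the paper are not reproduced here.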
Publisher
John Benjamins Publishing Company
Subject
Library and Information Sciences, Communication, Language and Linguistics
Cited by: 5 articles.