Approaches, tools, algorithms, and methods for automatic term extraction: A systematic literature mapping

Author:

Andrade Juan Carlos Blandón (1), Otálvaro Carlos Mario Medina (1), Jaramillo Carlos Mario Zapata (2), Ríos Alejandro Morales (1)

Affiliation:

1. Universidad Católica de Pereira

2. Universidad Nacional de Colombia

Abstract

Automatic term extraction is a branch of Natural Language Processing (NLP) used to automatically generate lexicographic materials such as glossaries, vocabularies, and dictionaries. It allows the creation of standard bases for building unified theories and translations between languages. The scientific literature shows great interest in building automatic term extractors and describes several approaches, tools, algorithms, and methods for doing so; however, the number of articles in specialized databases is vast, and existing literature reviews are not recent. This paper presents a systematic literature mapping of the existing material on developing automatic term extractors, providing an overview of the approaches, tools, algorithms, and methods used to create them. For this purpose, scientific articles in the domain published between 2015 and 2022 are reviewed and categorized. The mapping results show that the most used approaches are statistical (21.85%), linguistic (9.75%), and hybrid (68.29%). In addition, there are various computational tools for terminology extraction, built with different methods, whose results are measured in terms of precision and recall. Finally, 113 documents were selected to answer the research questions and to show how automatic term extractors are constructed. This paper presents a global summary of primary studies as an essential tool for approaching the construction of this type of computational system.

Publisher

Research Square Platform LLC

