A dependency-based machine learning approach to the identification of research topics: a case in COVID-19 studies

Published: 2021-08-24
Issue: ahead-of-print
Volume: ahead-of-print
Page:
ISSN: 0737-8831
Container-title: Library Hi Tech
Language: en
Short-container-title: LHT

Author:

Zhu Haoran, Lei Lei

Abstract

Purpose: Previous research on the automatic extraction of research topics mostly used rule-based or topic modeling methods, which have been criticized for their limited rules, their interpretability issues and their heavy dependence on human judgment. This study aims to address these issues by proposing a new method that integrates machine learning models with linguistic features for the identification of research topics.

Design/methodology/approach: First, dependency relations were used to extract noun phrases from research article texts. Second, the extracted noun phrases were classified into topics and non-topics using machine learning models with linguistic and bibliometric features. Lastly, a trend analysis was performed to identify hot research topics, i.e. topics with increasing popularity.

Findings: The new method was tested on a large dataset of COVID-19 research articles and achieved satisfactory results in terms of F-measures, accuracy and AUC values. Hot topics of COVID-19 research were also detected based on the classification results.

Originality/value: This study demonstrates that information retrieval methods can help researchers gain a better understanding of the latest trends in both COVID-19 and other research areas. The findings are significant to both researchers and policymakers.
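The trend-analysis step mentioned in the abstract (finding topics with increasing popularity) can be illustrated with a minimal sketch. The abstract does not say how the trend was measured, so the approach here is an assumption: fit a least-squares line to each topic's per-period counts and treat a positive slope as "hot". The function names `trend_slope` and `hot_topics`, the `min_slope` threshold and the toy counts are all hypothetical and are not taken from the paper.

```python
from statistics import mean


def trend_slope(counts):
    """Least-squares slope of counts against period index 0..n-1."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two periods to estimate a trend")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(counts)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, counts))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def hot_topics(topic_counts, min_slope=0.0):
    """Return topics whose slope exceeds min_slope, steepest first.

    topic_counts maps a topic label to its counts per consecutive period.
    The threshold is an illustrative assumption, not the paper's criterion.
    """
    slopes = {topic: trend_slope(c) for topic, c in topic_counts.items()}
    rising = [topic for topic, s in slopes.items() if s > min_slope]
    return sorted(rising, key=lambda topic: slopes[topic], reverse=True)


# Toy, made-up counts per period (not data from the study).
example = {
    "vaccine": [1, 2, 3, 4],
    "lockdown": [4, 3, 2, 1],
    "masks": [2, 2, 2, 2],
}
print(hot_topics(example))
```

In practice the counts would more likely be normalized by the number of articles per period, so that overall growth in publication volume is not mistaken for growth in a single topic.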

Publisher

Emerald

Subject

Library and Information Sciences, Information Systems

References: 46 articles.

Cited by 15 articles.

1. The impact of COVID-19 on infodemic research: a bibliometric analysis of global publications;Library Hi Tech;2024-07-11

2. Prosody in linguistic journals: a bibliometric analysis;Humanities and Social Sciences Communications;2024-02-26

3. How is the development of library and information science in China?;Library Hi Tech;2023-08-31

4. Analyzing the spatiotemporal coupling relationship between public opinion and the epidemic during COVID-19;Library Hi Tech;2023-06-22

5. Editorial: Special selection on contemporary bibliometric analytics;Library Hi Tech;2023-06-01