Creating and validating a scholarly knowledge graph using natural language processing and microtask crowdsourcing

Published: 2023-04-05
ISSN: 1432-5012
Container-title: International Journal on Digital Libraries
Language: en
Short-container-title: Int J Digit Libr

Author:

Oelen Allard, Stocker Markus, Auer Sören

Abstract

Due to the growing number of scholarly publications, finding relevant articles becomes increasingly difficult. Scholarly knowledge graphs can be used to organize the scholarly knowledge presented within those publications and represent it in machine-readable formats. Natural language processing (NLP) provides scalable methods to automatically extract knowledge from articles and populate scholarly knowledge graphs. However, NLP extraction is generally not sufficiently accurate and thus fails to generate data of high granularity and quality. In this work, we present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed with crowdsourcing. TinyGenius is employed to populate a paper-centric knowledge graph, using five distinct NLP methods. We extend our previous work on the TinyGenius methodology in various ways. Specifically, we discuss the NLP tasks in more detail and include an explanation of the data model. Moreover, we present a user evaluation where participants validate the generated NLP statements. The results indicate that employing microtasks for statement validation is a promising approach despite the varying participant agreement for different microtasks.

Funder

European Research Council

Publisher

Springer Science and Business Media LLC

Subject

Library and Information Sciences

Link

https://link.springer.com/content/pdf/10.1007/s00799-023-00360-7.pdf

References: 38 articles.

1. Jinha, A.: Article 50 million: An estimate of the number of scholarly articles in existence. Learned Publishing 23(3), 258–263 (2010). https://doi.org/10.1087/20100308

2. Mons, B., Velterop, J.: Nano-publication in the e-science era. CEUR Workshop Proceedings 523 (2009)

3. Kuhn, T., Chichester, C., Krauthammer, M., Queralt-Rosinach, N., Verborgh, R., Giannakopoulos, G.: Decentralized provenance-aware publishing with nanopublications. PeerJ Computer Science, 1–29 (2016). https://doi.org/10.7717/peerj-cs.78

4. Stocker, M., Paasonen, P., Fiebig, M., Zaidan, M.A., Hardisty, A.: Curating scientific information in knowledge infrastructures. Data Science Journal 17 (2018). https://doi.org/10.5334/dsj-2018-021

5. Jaradeh, M.Y., Oelen, A., Farfar, K.E., Prinz, M., D’Souza, J., Kismihók, G., Stocker, M., Auer, S.: Open research knowledge graph: Next generation infrastructure for semantic scholarly knowledge. K-CAP 2019 - Proceedings of the 10th International Conference on Knowledge Capture, 243–246 (2019). https://doi.org/10.1145/3360901.3364435

Cited by 3 articles.

1. Leveraging Gaming to Enhance Knowledge Graphs for Explainable Generative AI Applications;2024 IEEE Conference on Games (CoG);2024-08-05

2. Editorial to the special issue on JCDL 2022;International Journal on Digital Libraries;2024-06

3. A Survey on Extracting Knowledge Graphs by Employing Natural Language Processing;2024 10th International Conference on Web Research (ICWR);2024-04-24