A New Method for Graph-Based Representation of Text in Natural Language Processing

Published: 2023-06-27 Issue: 13 Volume: 12 Page: 2846
ISSN: 2079-9292
Container-title: Electronics
Language: en
Short-container-title: Electronics

Author:

Probierz Barbara¹², Hrabia Anita¹, Kozak Jan¹²

Affiliation:

1. Department of Machine Learning, University of Economics in Katowice, 1 Maja 50, 40-287 Katowice, Poland

2. Łukasiewicz Research Network—Institute of Innovative Technologies EMAG, Leopolda 31, 40-189 Katowice, Poland

Abstract

Natural language processing is still an emerging field in machine learning. Access to more and more data sets in textual form, new applications for artificial intelligence and the need for simple communication with operating systems all increase the importance of natural language processing in evolving artificial intelligence. Traditional methods of textual representation, such as Bag-of-Words, are limited because they do not consider semantics or dependencies between words. Therefore, we propose a new approach based on graph representations, which takes into account both local context and global relationships between words, allowing for a more expressive textual representation. The aim of the paper is to examine the possibility of using graph representations in natural language processing and to demonstrate their use in text classification. An innovative element of the proposed approach is the use of common cliques in graphs representing documents to create a feature vector. Experiments confirm that the proposed approach can improve classification efficiency. Using the new text representation method to predict book categories from an analysis of their content resulted in accuracy, precision, recall and F1-score all above 90%. Moving from traditional approaches to a graph-based approach could make a big difference in natural language processing and text analysis and could open up new opportunities in the field.
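The abstract does not spell out how the document graphs are built or how cliques become features, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a sliding-window co-occurrence graph per document, maximal cliques found with a plain Bron-Kerbosch search, and a binary feature for each clique that appears in at least `min_docs` documents. The function names, the `window`, `min_size` and `min_docs` parameters, and the toy documents are all hypothetical.

```python
from itertools import combinations


def build_graph(tokens, window=2):
    """Co-occurrence graph (assumed construction): words are nodes and an
    edge links two different words at most `window` positions apart."""
    adj = {}
    for i, w in enumerate(tokens):
        adj.setdefault(w, set())
        for v in tokens[i + 1 : i + 1 + window]:
            if v != w:
                adj[w].add(v)
                adj.setdefault(v, set()).add(w)
    return adj


def maximal_cliques(adj):
    """Bron-Kerbosch without pivoting; returns a set of frozensets."""
    found = set()

    def expand(r, p, x):
        if not p and not x:
            found.add(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


def is_clique(adj, nodes):
    """True if every pair of `nodes` is connected in the graph `adj`."""
    return all(b in adj.get(a, ()) for a, b in combinations(nodes, 2))


def common_cliques(graphs, min_size=3, min_docs=2):
    """Cliques of at least `min_size` words present in >= `min_docs` graphs."""
    candidates = {
        c for g in graphs for c in maximal_cliques(g) if len(c) >= min_size
    }
    kept = [
        c for c in candidates
        if sum(is_clique(g, c) for g in graphs) >= min_docs
    ]
    return sorted(kept, key=sorted)  # deterministic feature order


def feature_vector(adj, vocabulary):
    """Binary indicator per common clique (assumed encoding)."""
    return [int(is_clique(adj, c)) for c in vocabulary]


# Toy documents, invented for illustration only.
docs = [
    "graph based text representation".split(),
    "graph based text classification".split(),
    "deep neural network model".split(),
]
graphs = [build_graph(d) for d in docs]
vocab = common_cliques(graphs)
features = [feature_vector(g, vocab) for g in graphs]
print(features)
```

In this toy run the only clique shared by two documents is {graph, based, text}, so the first two documents receive that feature and the third does not. Such vectors could then be passed to any standard classifier; which classifier the paper uses is not stated in the abstract.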

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering

Link

https://www.mdpi.com/2079-9292/12/13/2846/pdf

Cited by 1 article.

1. An Attention-Based Method for the Minimum Vertex Cover Problem on Complex Networks;Algorithms;2024-02-06