Linked Data Triples Enhance Document Relevance Classification

Published: 2021-07-20 Issue: 14 Volume: 11 Page: 6636
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences

Author:

Nagumothu Dinesh, Eklund Peter W., Ofoghi Bahadorreza, Bouadjenek Mohamed Reda

Abstract

Standardized approaches to relevance classification in information retrieval use generative statistical models to identify the presence or absence of certain topics that might make a document relevant to the searcher. These approaches have been used to better predict relevance on the basis of what the document is “about”, rather than a simple-minded analysis of the bag of words contained within the document. More recently, this idea has been extended by using pre-trained deep learning models and text representations, such as GloVe or BERT. These use an external corpus as a knowledge base that conditions the model to help predict what a document is about. This paper adopts a hybrid approach that leverages the structure of knowledge embedded in a corpus. In particular, the paper reports on experiments in which linked data triples (subject-predicate-object), constructed from natural language elements, are derived from deep learning. These are evaluated as additional latent semantic features for a relevant document classifier in a customized news-feed website. The research is a synthesis of current thinking in deep learning models in NLP and information retrieval and the predicate structure used in semantic web research. Our experiments indicate that linked data triples increased the F-score of the baseline GloVe representations by 6% and show significant improvement over state-of-the-art models, such as BERT. The findings are tested and empirically validated on an experimental dataset and on two standardized pre-classified news sources, namely the Reuters and 20 Newsgroups datasets.

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/11/14/6636/pdf

Reference: 47 articles.

1. Predicting the Importance of Newsfeed Posts and Social Network Friends;Paek,2010

2. Automated Text Classification of News Articles: A Practical Guide

3. NLP in News Feeds; https://syncedreview.com/2019/01/12/nlp-in-news-feeds/

Cited by 4 articles.

1. Aspect-Based Fake News Detection;Lecture Notes in Computer Science;2024

2. Toward a Model to Evaluate Machine-Processing Quality in Scientific Documentation and Its Impact on Information Retrieval;Applied Sciences;2023-12-07

3. Semantic Enrichment of Taxonomy for BI Applications using Multifaceted data sources through NLP techniques;Procedia Computer Science;2022

4. Development and Evaluation of an Intelligence and Learning System in Jurisprudence Text Mining in the Field of Competition Defense;Applied Sciences;2021-12-01