Knowledge-driven graph similarity for text classification

Published:2020-11-19 Issue:4 Volume:12 Page:1067-1081
ISSN:1868-8071
Container-title:International Journal of Machine Learning and Cybernetics
language:en
Short-container-title:Int. J. Mach. Learn. & Cyber.

Author:

Shanavas Niloofer, Wang Hui, Lin Zhiwei, Hawe Glenn

Abstract

Automatic text classification using machine learning is significantly affected by the text representation model. The structural information in text is necessary for natural language understanding, but it is usually ignored in vector-based representations. In this paper, we present a graph kernel-based text classification framework which utilises the structural information in text effectively through the weighting and enrichment of a graph-based representation. We introduce weighted co-occurrence graphs to represent text documents, which weight the terms and their dependencies based on their relevance to text classification. We propose a novel method to automatically enrich the weighted graphs using semantic knowledge in the form of a word similarity matrix. The similarity between enriched graphs, knowledge-driven graph similarity, is calculated using a graph kernel. The semantic knowledge in the enriched graphs ensures that the graph kernel goes beyond exact matching of terms and patterns to compute the semantic similarity of documents. In the experiments on sentiment classification and topic classification tasks, our knowledge-driven similarity measure significantly outperforms the baseline text similarity measures on five benchmark text classification datasets.
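The pipeline outlined in the abstract (co-occurrence graph, enrichment from word similarities, graph kernel) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the window size, the enrichment rule (copying an edge to a similar term, scaled by the similarity score), the threshold, the toy similarity values, and all function names are assumptions made for illustration only. The kernel used here is a plain edge-matching kernel, chosen for brevity, and the relevance-based weighting of terms described in the abstract is replaced by raw co-occurrence counts.

```python
import math
from collections import Counter


def cooccurrence_graph(tokens, window=1):
    """Undirected graph as {(u, v): weight} with u < v.

    Assumption: the weight is the raw count of co-occurrences within
    `window` following tokens, not the paper's relevance-based weight.
    """
    edges = Counter()
    for i, u in enumerate(tokens):
        for v in tokens[i + 1 : i + 1 + window]:
            if u != v:
                edges[tuple(sorted((u, v)))] += 1.0
    return dict(edges)


def enrich(graph, similarity, threshold=0.5):
    """Hypothetical enrichment: for each edge, add edges to terms similar
    to either endpoint, weighted by edge weight times similarity."""
    neighbours = {}
    for (a, b), s in similarity.items():
        if s >= threshold:
            neighbours.setdefault(a, []).append((b, s))
            neighbours.setdefault(b, []).append((a, s))
    enriched = dict(graph)
    for (u, v), w in graph.items():
        for fixed, moved in ((v, u), (u, v)):
            for other, s in neighbours.get(moved, []):
                if other != fixed:
                    key = tuple(sorted((other, fixed)))
                    # Keep the strongest evidence for each edge.
                    enriched[key] = max(enriched.get(key, 0.0), w * s)
    return enriched


def edge_kernel(g1, g2):
    """Sum of weight products over edges present in both graphs."""
    return sum(w * g2[e] for e, w in g1.items() if e in g2)


def normalised_kernel(g1, g2):
    """Edge kernel scaled by the self-similarities; 0.0 for empty graphs."""
    denom = math.sqrt(edge_kernel(g1, g1) * edge_kernel(g2, g2))
    return edge_kernel(g1, g2) / denom if denom else 0.0


# Toy similarity scores (invented for this example).
similarity = {("good", "great"): 0.8, ("movie", "film"): 0.9}

g1 = cooccurrence_graph("good movie".split())
g2 = cooccurrence_graph("great film".split())

plain = normalised_kernel(g1, g2)
enriched = normalised_kernel(enrich(g1, similarity), enrich(g2, similarity))
print(plain, enriched)
```

In this toy case the two documents share no terms, so the kernel on the plain graphs is zero, while the enriched graphs share edges and yield a positive similarity. That is the effect the abstract describes as going beyond exact matching of terms and patterns.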

Funder

Ulster University

University of Ulster

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence,Computer Vision and Pattern Recognition,Software

Link

http://link.springer.com/content/pdf/10.1007/s13042-020-01221-4.pdf

References: 37 articles.

1. Altınel B, Diri B, Ganiz MC (2015) A novel semantic smoothing kernel for text classification with class-based weighting. Knowl Based Syst 89:265–277

2. Altınel B, Ganiz MC, Diri B (2015) A corpus-based semantic kernel for text classification by using meaning values of terms. Eng Appl Artif Intell 43:54–66

3. Bleik S, Mishra M, Huan J, Song M (2013) Text categorization of biomedical data sets using graph kernels and a controlled vocabulary. IEEE/ACM Trans Comput Biol Bioinform (TCBB) 10(5):1211–1217

4. Blitzer J, Dredze M, Pereira F (2007) Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In: Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, Association for Computational Linguistics, Prague, Czech Republic, pp 440–447

5. Bloehdorn S, Basili R, Cammisa M, Moschitti A (2006) Semantic kernels for text classification based on topological measures of feature similarity. In: Sixth International Conference on Data Mining (ICDM’06), pp 808–812

Cited by 24 articles.

1. Learning by imitating the classics: Mitigating class imbalance in federated learning via simulated centralized learning;Expert Systems with Applications;2024-12

2. CDKT-FL: Cross-device knowledge transfer using proxy dataset in federated learning;Engineering Applications of Artificial Intelligence;2024-07

3. Preserving differential privacy in neural networks for foreign object detection with heterogeneity-based noising among distributed devices;The Journal of Supercomputing;2024-06-11

4. Global prototype distillation for heterogeneous federated learning;Scientific Reports;2024-05-27

5. FedBnR: Mitigating federated learning Non-IID problem by breaking the skewed task and reconstructing representation;Future Generation Computer Systems;2024-04