A Graph Convolutional Network-Based Sensitive Information Detection Algorithm

Published: 2021-03-24 Volume: 2021 Page: 1-8
ISSN:1099-0526
Container-title:Complexity
language:en
Short-container-title:Complexity

Author:

Liu Ying¹, Yang Chao-Yu¹, Yang Jie²

Affiliation:

1. School of Economics and Management, Anhui University of Science and Technology, Huainan, China

2. Faculty of Engineering and Information Sciences, School of Computing and Information Technology, University of Wollongong, Wollongong, NSW, Australia

Abstract

In natural language processing (NLP), sensitive information detection is the task of identifying sensitive words in given documents. Most existing detection methods are based on the sensitive-word tree, which is usually constructed from the common prefixes of the sensitive words in a given corpus. These traditional methods suffer from drawbacks such as poor generalization and low efficiency. To address them, this paper proposes a novel self-attention-based detection algorithm built on a graph convolutional network (GCN). The main contribution is twofold. First, a weighted GCN is used to better encode word pairs from the given documents and corpus. Second, a simple yet effective attention mechanism is introduced to further integrate the interaction between candidate words and the corpus. Experimental results on the THUC news benchmark dataset demonstrate promising detection performance compared with existing work.
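The sensitive-word tree that the abstract names as the traditional baseline can be sketched as a prefix tree (trie). The following is a minimal illustration only, not the authors' implementation and not the proposed GCN method: the function names `build_trie` and `find_sensitive`, the `END` marker, and the sample word list are all assumptions made for this example. It also hints at the generalization problem mentioned above, since only exact matches against the stored words are found.

```python
from typing import Dict, List, Tuple

# Hypothetical marker key meaning "a sensitive word ends at this node".
# It is longer than one character, so it cannot collide with a character key.
END = "$end"


def build_trie(words: List[str]) -> Dict:
    """Build a prefix tree in which words with a common prefix share nodes."""
    root: Dict = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def find_sensitive(text: str, trie: Dict) -> List[Tuple[int, str]]:
    """Return (start_index, word) for every stored word found in text."""
    hits: List[Tuple[int, str]] = []
    for start in range(len(text)):
        node = trie
        for end in range(start, len(text)):
            node = node.get(text[end])
            if node is None:
                break  # no stored word continues with this character
            if END in node:
                hits.append((start, text[start:end + 1]))
    return hits


if __name__ == "__main__":
    trie = build_trie(["bad", "badge", "ban"])  # illustrative word list
    print(find_sensitive("a badge ban", trie))
```

Because the scan restarts at every character position, the worst-case cost grows with the text length times the longest stored word, and any variant spelling that is not in the word list is missed entirely.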

Publisher

Hindawi Limited

Subject

Multidisciplinary, General Computer Science

Link

http://downloads.hindawi.com/journals/complexity/2021/6631768.pdf

Cited by 2 articles.

1. Sensitive Word Recognition Scheme Based on Text-RCNN Model;2023 International Conference on Data Science and Network Security (ICDSNS);2023-07-28

2. Knowledge-Graph- and GCN-Based Domain Chinese Long Text Classification Method;Applied Sciences;2023-07-06