Learning Chinese Word Segmentation Based on Bidirectional GRU-CRF and CNN Network Model

Published:2019-07 Issue:3 Volume:15 Page:47-62
ISSN:1548-3908
Container-title:International Journal of Technology and Human Interaction
language:en

Author:

Yu Chenghai¹,Wang Shupei¹,Guo Jiajun¹

Affiliation:

1. Zhejiang Sci-Tech University, Zhejiang, China

Abstract

Chinese word segmentation is the basis of Chinese natural language processing (NLP). With the development of deep learning, various neural network models have been applied to Chinese word segmentation. However, current neural network models suffer from manual feature extraction, nonstandard word weights, an inability to use long-distance information effectively, and long training times. To address these problems, this article presents a CNN-Bidirectional GRU-CRF neural network model (CNN Bidirectional GRU CRF Network, CBiGCN), which breaks through the window limit of conventional methods and realizes end-to-end processing. It applies a five-tag set method and a bias-variable-weight greedy strategy, supplemented by the Goldstein-Armijo guidelines, to the neural network model. The model has a simple structure and is easy to operate. It learns features automatically, reduces the amount of task-specific knowledge needed in the form of handcrafted features and data pre-processing, and makes effective use of context information. The authors ran an experiment on two Chinese word segmentation corpora to evaluate their system. The experiment verified that the new model obtains better Chinese word segmentation results and greatly reduces training time.

Publisher

IGI Global

Subject

Human-Computer Interaction,Information Systems

References: 25 articles.

1. Automatic Segmentation of Chinese Characters as Wire-Frame Models

2. Long Short-Term Memory Neural Networks for Chinese Word Segmentation

3. Chinese Information Society. (2016). Chinese Information Processing Development Report. Retrieved from http://www.cipsc.org.cn/

4. CSDN.net. (n.d.). Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling[OL]. Retrieved from http://blog.csdn.net/meanme/article/details/48845793

5. Novel conditional random field model extended by tensor and its application in natural language processing tasks;Y. Feng;Jisuanji Yingyong Yanjiu;2016

Cited by 12 articles.

1. A false emotion opinion target extraction model with two stage BERT and background information fusion;Expert Systems with Applications;2024-09

2. Fast Recurrent Neural Network with Bi-LSTM for Handwritten Tamil Text Segmentation in NLP;ACM Transactions on Asian and Low-Resource Language Information Processing;2024-05-10

3. Text Classification of Mixed Model Based on Deep Learning;Tehnički glasnik;2023-07-19

4. Developing a novel hybrid Auto Encoder Decoder Bidirectional Gated Recurrent Unit model enhanced with empirical wavelet transform and Boruta-Catboost to forecast significant wave height;Journal of Cleaner Production;2022-12

5. A GAT-Based Chinese Text Classification Model: Using of Redical Guidance and Association Between Characters Across Sentences;Knowledge Science, Engineering and Management;2022