A hybrid Chinese word segmentation model for quality management-related texts based on transfer learning

Published: 2022-10-07 Issue: 10 Volume: 17 Page: e0270154
ISSN: 1932-6203
Container-title: PLOS ONE
Language: en
Short-container-title: PLoS ONE

Author:

Wen Peihan, Feng Linhan, Zhang Tian

Abstract

Text information mining is a key step in data-driven automatic or semi-automatic quality management (QM). Chinese texts need a word segmentation algorithm for pre-processing because no explicit marks define word boundaries. Because of the intrinsic characteristics of QM-related texts, word segmentation algorithms designed for general Chinese texts cannot be applied directly. Based on an analysis of QM-related texts, we summarize six features and propose a hybrid Chinese word segmentation model, mTL-Bi-LSTM-MA-CRF, which integrates transfer learning (TL), bidirectional long short-term memory (Bi-LSTM), multi-head attention (MA), and a conditional random field (CRF) to address the insufficient samples of QM-related texts and the excessive cutting of idioms. The model is built in two steps. First, on top of a word embedding space, the Bi-LSTM learns context information, the MA mechanism allocates attention among subspaces, and the CRF learns label sequence constraints. Second, a modified TL method is put forward for text feature extraction, adaptive learning of layer weights, and loss function correction for selective learning. Experimental results show that the proposed model achieves good word segmentation results with only a relatively small set of samples.
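The role of the final CRF step, enforcing label sequence constraints, can be illustrated with a small sketch. It is not the authors' implementation. It assumes segmentation is cast as per-character tagging with a BMES scheme (Begin, Middle, End, Single), which the abstract does not state, and it uses hand-written, hypothetical emission scores in place of the Bi-LSTM and attention output. It also applies hard transition rules where a trained CRF would use learned transition scores.

```python
import math

# Assumed tag set: B = word begin, M = word middle, E = word end, S = single-char word.
TAGS = ["B", "M", "E", "S"]

# Tags that may follow each tag; every other transition is forbidden.
ALLOWED = {
    "B": {"M", "E"},
    "M": {"M", "E"},
    "E": {"B", "S"},
    "S": {"B", "S"},
}
START_OK = {"B", "S"}  # a sentence cannot start inside a word
END_OK = {"E", "S"}    # a sentence cannot end inside a word


def viterbi(emissions):
    """Return the best valid tag path.

    emissions: one dict per character, mapping each tag to a score
    (hypothetical stand-in for the neural encoder's output).
    """
    if not emissions:
        return []
    neg_inf = -math.inf
    # best[i][t]: score of the best valid path ending with tag t at position i
    best = [{t: (emissions[0][t] if t in START_OK else neg_inf) for t in TAGS}]
    back = []
    for scores in emissions[1:]:
        cur, ptr = {}, {}
        for t in TAGS:
            # Only predecessors that are allowed to transition into t.
            prev_score, prev_tag = max(
                (best[-1][p], p) for p in TAGS if t in ALLOWED[p]
            )
            cur[t] = prev_score + scores[t]
            ptr[t] = prev_tag
        best.append(cur)
        back.append(ptr)
    # Pick the best final tag, then follow back-pointers to the start.
    last = max((best[-1][t], t) for t in TAGS if t in END_OK)[1]
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def tags_to_words(chars, tags):
    """Cut the character sequence after every E or S tag."""
    words, buf = [], ""
    for ch, tag in zip(chars, tags):
        buf += ch
        if tag in ("E", "S"):
            words.append(buf)
            buf = ""
    if buf:  # defensive: flush a trailing unfinished word
        words.append(buf)
    return words


# Hypothetical scores for the four characters of "质量管理" (quality management).
# Picking the top tag per character would give B, B, B, E, which is invalid.
example_emissions = [
    {"B": 2.0, "M": 0.0, "E": 0.0, "S": 1.0},
    {"B": 1.5, "M": 0.0, "E": 1.0, "S": 0.0},
    {"B": 2.0, "M": 0.0, "E": 0.0, "S": 0.5},
    {"B": 0.0, "M": 0.0, "E": 2.0, "S": 0.5},
]
example_tags = viterbi(example_emissions)
example_words = tags_to_words("质量管理", example_tags)
print(example_tags, example_words)
```

In this toy input the per-character top scores form an invalid tag sequence, while the constrained decoder returns a well-formed one. That is the kind of correction a label-constraint layer is meant to provide.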

Funder

National Key Research and Development Program of China

Publisher

Public Library of Science (PLoS)

Subject

Multidisciplinary

Cited by 4 articles.

1. Do consumers concern about energy saving in purchasing energy-efficient home appliances? Evidence from online e-commerce review;Energy Policy;2024-10

2. Research on Chinese Word Segmentation Algorithm in the Tobacco Field Based on the BERT-BiLSTM-CRF Model;Lecture Notes in Electrical Engineering;2024

3. Systematic knowledge modeling and extraction methods for manufacturing process planning based on knowledge graph;Advanced Engineering Informatics;2023-10

4. Quantitative Evaluation of Pharmaceutical Industry in Jilin Province Based on Text Mining;Proceedings of the 2023 4th International Conference on Big Data and Informatization Education (ICBDIE 2023);2023-09-26