Think Twice: A Post-Processing Approach for the Chinese Spelling Error Correction

Published:2021-06-23 Issue:13 Volume:11 Page:5832
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Gou Wei, Chen Zheng

Abstract

Chinese Spelling Error Correction is an active research topic in natural language processing. Researchers have proposed many solutions, from early rule-based approaches to current deep learning methods. At present, SpellGCN, proposed by Alibaba’s team, achieves the best results, with a character-level precision of 98.4% on SIGHAN2013. However, when we apply this algorithm to practical error correction tasks, it produces many false corrections. We believe this is because the corpus used for model training contains significantly more errors than the text the model corrects in practice. To address this problem, we propose a post-processing step for error correction. We treat each character output by the initial model as a candidate, extract features of the character itself and its context, and then use a classification model to filter out the initial model’s false corrections. The post-processing idea introduced in this paper can be applied to most Chinese Spelling Error Correction models to improve their performance on practical error correction tasks.
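The two-stage idea in the abstract (an initial model proposes corrections, then a classifier decides which to keep) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: `Proposal`, `collect_proposals`, `post_process`, and `threshold_classifier` are illustrative names, the single confidence feature stands in for the paper's richer character and context features, and the fixed 0.9 threshold stands in for a trained classification model.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Proposal:
    """One character-level change proposed by the initial correction model."""
    position: int          # index of the character in the sentence
    original: str          # character in the input text
    candidate: str         # character proposed by the initial model
    features: List[float]  # hypothetical feature vector (here: confidence only)


def collect_proposals(source: str, corrected: str,
                      confidences: Sequence[float]) -> List[Proposal]:
    """Collect every position where the initial model changed a character."""
    if not (len(source) == len(corrected) == len(confidences)):
        raise ValueError("source, corrected and confidences must align")
    return [
        Proposal(i, s, c, [confidences[i]])
        for i, (s, c) in enumerate(zip(source, corrected))
        if s != c
    ]


def post_process(source: str, proposals: List[Proposal],
                 accept: Callable[[List[float]], bool]) -> str:
    """Apply only the proposals that the second-stage classifier accepts."""
    chars = list(source)
    for p in proposals:
        if accept(p.features):
            chars[p.position] = p.candidate
    return "".join(chars)


def threshold_classifier(features: List[float]) -> bool:
    """Assumed stand-in for a trained classifier: accept if confidence >= 0.9."""
    return features[0] >= 0.9


if __name__ == "__main__":
    source = "他今天很高心"           # input with one real error (心 should be 兴)
    model_output = "她今天很高兴"     # initial model also changes 他 to 她
    confidences = [0.55, 0.99, 0.99, 0.99, 0.99, 0.97]  # made-up scores
    proposals = collect_proposals(source, model_output, confidences)
    print(post_process(source, proposals, threshold_classifier))
```

In this toy run the low-confidence change of 他 to 她 is rejected and the high-confidence change of 心 to 兴 is kept. In practice the `accept` callable would be replaced by whatever trained classifier and feature set the correction pipeline uses.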

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/11/13/5832/pdf

Reference: 38 articles.

1. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding;Devlin;arXiv,2018

2. Spelling Error Correction with Soft-Masked BERT;Zhang;arXiv,2020

3. SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check;Cheng;arXiv,2020

Cited by 6 articles.

1. PERCORE: A Deep Learning-Based Framework for Persian Spelling Correction with Phonetic Analysis;International Journal of Computational Intelligence Systems;2024-05-08

2. MLSL-Spell: Chinese Spelling Check Based on Multi-Label Annotation;Applied Sciences;2024-03-18

3. Comparison Between Bi-Directional LSTM And Transfer Learning in Correcting Typing Errors on Twitter Social Media Posts;2023 11th International Conference on Cyber and IT Service Management (CITSM);2023-11-10

4. EDMSpell: Incorporating the error discriminator mechanism into chinese spelling correction for the overcorrection problem;Journal of King Saud University - Computer and Information Sciences;2023-06

5. A Comprehensive Dataset of Spelling Errors and Users’ Corrections in Croatian Language;Data;2023-05-12