A Framework for Indonesian Grammar Error Correction

Published:2021-07-31 Issue:4 Volume:20 Page:1-12
ISSN:2375-4699
Container-title:ACM Transactions on Asian and Low-Resource Language Information Processing
language:en
Short-container-title:ACM Trans. Asian Low-Resour. Lang. Inf. Process.

Author:

Lin Nankai¹, Chen Boyu¹, Lin Xiaotian¹, Wattanachote Kanoksak¹, Jiang Shengyi¹

Affiliation:

1. School of Computer Science and Technology, Guangdong University of Foreign Studies, Guangzhou, Guangdong, China

Abstract

Grammatical Error Correction (GEC) is a challenging task in Natural Language Processing research. Although many researchers have focused on GEC for widely used languages such as English and Chinese, few studies address Indonesian, which is a low-resource language. In this article, we propose a GEC framework that has the potential to serve as a baseline method for Indonesian GEC tasks. The framework treats GEC as a multi-class classification task. It integrates different language embedding models and deep learning models to correct 10 types of Part-of-Speech (POS) errors in Indonesian text. In addition, we constructed an Indonesian corpus that can be used as an evaluation dataset for Indonesian GEC research, and we evaluated our framework on this dataset. The results show that the Long Short-Term Memory model based on word embeddings achieved the best performance, with an overall macro-average F0.5 of 0.551 across the 10 POS error types. The results also show that the framework can be trained on a low-resource dataset.
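The abstract reports a macro-average F0.5 but does not spell out how it is computed. The sketch below assumes the standard F-beta definition with beta = 0.5 (which weights precision more heavily than recall) and an unweighted mean over error types. The function names and the example counts are hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple


def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta from raw counts; returns 0.0 when the score is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def macro_f_beta(counts: Dict[str, Tuple[int, int, int]], beta: float = 0.5) -> float:
    """Unweighted mean of the per-type F-beta scores (macro average)."""
    scores = [f_beta(tp, fp, fn, beta) for tp, fp, fn in counts.values()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical (tp, fp, fn) counts for two made-up error types.
example_counts = {"type_a": (8, 2, 8), "type_b": (5, 5, 0)}
macro_score = macro_f_beta(example_counts)
```

Because the macro average gives every error type the same weight, a rare error type with poor performance lowers the overall score as much as a frequent one would.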

Funder

National Natural Science Foundation of China

Major Projects of Guangdong Education Department for Foundation Research and Applied Research

Special Funds for the Cultivation of Guangdong College Student's Scientific and Technological Innovation

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3440993

Cited by 14 articles.

1. Dynamic decoding and dual synthetic data for automatic correction of grammar in low-resource scenario;PeerJ Computer Science;2024-07-05

2. Detection of English Grammatical Errors and Correction using Graph Dual Encoder Decoder with Pyramid Attention Network;Rupkatha Journal on Interdisciplinary Studies in Humanities;2024-06-29

3. Research on error detection in English translation texts using machine learning algorithms;Intelligent Decision Technologies;2024-06-07

4. English Grammar Error Detection and Intelligent Assisted Correction Using Autoencoders;2024 International Conference on Machine Intelligence and Digital Applications;2024-05-30

5. Performance Evaluation and Improvement of Deep Echo State Network Models in English Writing Assistance and Grammar Error Correction;ICST Transactions on Scalable Information Systems;2024-03-01