Chinese Comma Disambiguation in Math Word Problems Using SMOTE and Random Forests

Published: 2021-12-20 Issue: 4 Volume: 2 Page: 738-755
ISSN: 2673-2688
Container-title: AI
Language: en
Short-container-title: AI

Author:

Huang Jingxiu, Liu Qingtang, Zheng Yunxiang, Wu Linjing

Abstract

Natural language understanding technologies play an essential role in automatically solving math word problems. In machine understanding of Chinese math word problems, comma disambiguation, which is a class-imbalanced binary learning problem, is a valuable step toward transforming the problem statement into a structured representation. To address this problem, we applied the synthetic minority oversampling technique (SMOTE) and random forests to comma classification after jointly optimizing their hyperparameters. We propose a strict measure to evaluate the performance of deployed comma classification models on comma disambiguation in math word problems. To verify the effectiveness of random forest classifiers with SMOTE on comma disambiguation, we conducted two-stage experiments on two datasets with a collection of evaluation measures. Experimental results showed that random forest classifiers were significantly superior to baseline methods in Chinese comma disambiguation. Running SMOTE with hyperparameters optimized for the categorical distribution of each dataset is preferable to running it with default values. For practitioners, we suggest that the hyperparameters of a classification model be optimized again after the parameter settings of SMOTE have been changed.
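The oversampling step named in the abstract can be illustrated with a minimal sketch of basic SMOTE: each synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the toy feature vectors are assumptions made for illustration only; the abstract does not state which library or settings were used.

```python
import math
import random


def smote(minority, n_synthetic, k_neighbors=5, seed=0):
    """Basic SMOTE sketch: interpolate between minority samples and
    their k nearest minority-class neighbours (hypothetical helper)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    # k cannot exceed the number of other minority samples.
    k = min(k_neighbors, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy, made-up minority-class feature vectors (not data from the paper).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_synthetic=6, k_neighbors=2)
```

The number of neighbours and the number of synthetic samples are examples of the SMOTE settings that, per the abstract, are better tuned to each dataset than left at default values. Because changing them alters the training distribution seen by the classifier, the classifier's own hyperparameters would then be searched again, as the abstract recommends.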

Funder

China Postdoctoral Science Foundation

National Natural Science Foundation of China

Publisher

MDPI AG

Link

https://www.mdpi.com/2673-2688/2/4/44/pdf
