Naive Bayesian Prediction of Japanese Annotated Corpus for Textual Semantic Word Formation Classification

Published:2022-03-16 Volume:2022 Page:1-14
ISSN:1563-5147
Container-title:Mathematical Problems in Engineering
language:en
Short-container-title:Mathematical Problems in Engineering

Author:

Hao Zhoushao¹

Affiliation:

1. Luoyang Normal University, Luoyang Henan 471934, China

Abstract

With the rapid development of Japanese information processing technology, problems such as polysemy and ambiguity at the text and dialogue level, as well as unregistered words, have become increasingly prominent because computers cannot fully “understand” the semantics of words. For a computer to “understand” word semantics accurately, it must “understand”, from a semantic perspective, the rules by which words are converted and combined to form words. Traditional Japanese text classification mostly adopts the vector space model for text representation, which tends to produce confused classification results. This paper therefore proposes constructing a semantic word formation pattern prediction model based on a large-scale annotated corpus, using a solution that combines Japanese semantic word formation rules with pattern recognition algorithms. For this scheme, a variety of pattern recognition algorithms were compared and analyzed, and the naive Bayes model was chosen to predict semantic word formation patterns. The paper further improves the accuracy of computer prediction of Japanese semantic word formation patterns by adding part-of-speech information. Before modeling, the parts of speech of words are automatically tagged and manually checked based on the original annotated corpus. To study the prediction of Japanese semantic word formation patterns, the paper builds a prediction model based on naive Bayes and conducts simulation experiments. The eight types of semantic word formation patterns in the annotated corpus are divided into two groups, and the sample set obtained for each group is divided into a training set and a test set, so that the naive Bayes model first learns semantic word formation rules from each group's training set and then predicts semantic word formation patterns on that group's test set. The simulation results show that the prediction model has a generally high degree of fit and prediction accuracy. A prediction model built on this approach can help the computer judge semantic word formation patterns more accurately.
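
The workflow described in the abstract (part-of-speech features, a training set and a test set, and a naive Bayes classifier) can be illustrated with a minimal sketch. This is not the author's implementation: the pattern labels, the part-of-speech values, the toy samples, and the use of add-one smoothing are all assumptions made purely for illustration, and the class name CategoricalNaiveBayes is hypothetical.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes over discrete features with add-one (Laplace) smoothing."""

    def fit(self, samples, labels):
        self.class_counts = Counter(labels)
        self.total = len(labels)
        # feature_counts[label][position][value] -> number of occurrences
        self.feature_counts = defaultdict(lambda: defaultdict(Counter))
        # vocab[position] -> set of feature values seen at that position
        self.vocab = defaultdict(set)
        for features, label in zip(samples, labels):
            for pos, value in enumerate(features):
                self.feature_counts[label][pos][value] += 1
                self.vocab[pos].add(value)
        return self

    def log_posterior(self, features, label):
        # log P(label) + sum of log P(feature value | label), unnormalized
        score = math.log(self.class_counts[label] / self.total)
        for pos, value in enumerate(features):
            count = self.feature_counts[label][pos][value]
            # +1 in the denominator reserves mass for unseen values
            denom = self.class_counts[label] + len(self.vocab[pos]) + 1
            score += math.log((count + 1) / denom)
        return score

    def predict(self, features):
        return max(self.class_counts,
                   key=lambda label: self.log_posterior(features, label))


# Invented toy data: (POS of first component, POS of second component)
# paired with a hypothetical word formation pattern label.
train_set = [
    (("noun", "noun"), "coordinate"),
    (("noun", "noun"), "coordinate"),
    (("adjective", "noun"), "modifier-head"),
    (("adjective", "noun"), "modifier-head"),
    (("verb", "noun"), "verb-object"),
    (("verb", "noun"), "verb-object"),
]
test_set = [
    (("adjective", "noun"), "modifier-head"),
    (("verb", "noun"), "verb-object"),
]

model = CategoricalNaiveBayes().fit(
    [features for features, _ in train_set],
    [label for _, label in train_set],
)

correct = sum(model.predict(features) == label for features, label in test_set)
accuracy = correct / len(test_set)
print(f"test accuracy: {accuracy:.2f}")
```

In the setting the abstract describes, the same fit-then-predict cycle would be run separately for each of the two groups of patterns, with the features drawn from the annotated corpus rather than from a hand-written list.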

Publisher

Hindawi Limited

Subject

General Engineering, General Mathematics

Link

http://downloads.hindawi.com/journals/mpe/2022/8048335.pdf

References: 25 articles.

1. Flame prediction based on harmful expression judgement using distributed representation;K. Matsumoto;International Journal of Technology and Engineering Studies,2018

2. Language Modeling for Morphologically Rich Languages: Character-Aware Modeling for Word-Level Prediction

3. The role of semantic processing in reading Japanese orthographies: an investigation using a script-switch paradigm

4. Enhancing Aspect-Based Sentiment Analysis of Arabic Hotels’ reviews using morphological, syntactic and semantic features

5. Understanding semantic accents in Japanese–English bilinguals: A feature-based approach

Cited by 2 articles.

1. Implementation of text mining for classification of drug effectiveness using the naïve bayes algorithm;AIP Conference Proceedings;2024

2. Prediction of Chinese Semantic Word Formation Patterns Based on Annotated Corpus;2023 3rd International Conference on Mobile Networks and Wireless Communications (ICMNWC);2023-12-04