Application of an Improved CHI Feature Selection Algorithm

Published: 2021-05-13 Volume: 2021 Page: 1-8
ISSN:1607-887X
Container-title:Discrete Dynamics in Nature and Society
language:en

Author:

Cai Liang-jing¹, Lv Shu¹, Shi Kai-bo²

Affiliation:

1. School of Mathematical Sciences, University of Electronic Science and Technology of China, Sichuan, Chengdu 611731, China

2. School of Electronic Information and Electrical Engineering, Chengdu University, Sichuan, Chengdu 610106, China

Abstract

Text classification is a core task in machine learning and is widely applied in information filtering, sentiment analysis, and text review. Improving the accuracy of classification results is very important and has been the main research goal in this field in recent years. Feature selection plays an important role in text classification: it eliminates irrelevant features, reduces dimensionality, and improves classification accuracy. This paper therefore studies the CHI feature selection algorithm. The main work and innovations are as follows. First, the paper analyzes the flaws of the CHI algorithm, identifies the introduction of new parameters as the direction for improving it, and proposes a new algorithm based on variance and coefficient of variation. Second, experiments are designed to verify the effectiveness of the new algorithm. In terms of language, the experiments cover two text classification systems, Chinese and English. In terms of classifiers, two are used: the KNN classifier and the Naive Bayes classifier. In terms of data, two distribution types are used: balanced datasets and unbalanced datasets. Finally, three comparative experiments are conducted and the results of each are analyzed. All experimental results are significantly better than those obtained before the improvement.
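For orientation, the baseline that the abstract refers to can be sketched as follows. This is only the classic CHI (chi-square) term-category statistic in its standard textbook form, not the improved algorithm proposed in the paper, whose variance and coefficient-of-variation terms are not given on this page. The function names, the toy counts, and the top-k selection step are assumptions made for illustration.

```python
def chi_square(a, b, c, d):
    """Classic CHI statistic for one term/category pair.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Term occurs in every document or in none: it carries no signal.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def select_top_k(scores, k):
    """Return the k terms with the highest CHI score (hypothetical helper)."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [term for term, _ in ranked[:k]]


# Toy contingency counts (a, b, c, d) per term; invented for illustration.
term_counts = {
    "goal": (40, 5, 10, 45),   # concentrated in the category
    "match": (30, 20, 20, 30), # weakly associated
    "the": (50, 50, 0, 0),     # appears in every document
}
scores = {term: chi_square(*counts) for term, counts in term_counts.items()}
selected = select_top_k(scores, 2)
```

With these toy counts, a term concentrated in one category scores higher than a term spread evenly across categories, which is the behavior feature selection relies on.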

Publisher

Hindawi Limited

Subject

Modelling and Simulation

Link

http://downloads.hindawi.com/journals/ddns/2021/9963382.pdf

References: 14 articles.

1. On Two-Stage Feature Selection Methods for Text Classification

2. Study on feature selection in Chinese text categorization;Q. Zhou;Journal of Chinese Information Processing,2004

3. Chinese Public's Attention to the COVID-19 Epidemic on Social Media: Observational Descriptive Study

4. Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification

5. Improved CHI text feature selection based on word frequency information;H. Liu;Computer Engineering and Applications,2013

Cited by 10 articles.

1. Classification of imbalanced datasets utilizing the synthetic minority oversampling method in conjunction with several machine learning techniques;Iran Journal of Computer Science;2024-09-11

2. Deep Learning for Enhanced IoMT Security: A GNN-BiLSTM Intrusion Detection System;2024 International Conference on Circuit, Systems and Communication (ICCSC);2024-06-28

3. Identifying Key Learning Algorithm Parameter of Forward Feature Selection to Integrate with Ensemble Learning for Customer Churn Prediction;VFAST Transactions on Software Engineering;2024-06-11

4. Differential diagnosis of erythemato-squamous diseases using a hybrid ensemble machine learning technique;Intelligent Decision Technologies;2024-06-07

5. A metaheuristic based filter-wrapper approach to feature selection for fake news detection;Multimedia Tools and Applications;2024-03-05