A Novel Feature Selection Technique for Text Classification Using Naïve Bayes

Published: 2014-10-29 Volume: 2014 Page: 1-10
ISSN: 2356-7872
Container-title: International Scholarly Research Notices
Language: en

Author:

Dey Sarkar Subhajit¹, Goswami Saptarsi¹, Agarwal Aman¹, Aktar Javed¹

Affiliation:

1. Department of Computer Science and Engineering, Institute of Engineering and Management, West Bengal 700091, India

Abstract

With the proliferation of unstructured data, text classification (also called text categorization) has found many applications, including topic classification, sentiment analysis, authorship identification, and spam detection. Many classification algorithms are available, and naïve Bayes remains one of the oldest and most popular. It is simple to implement and requires less training data. The literature, however, reports that naïve Bayes performs poorly compared to other classifiers in text classification, and this makes the naïve Bayes classifier unusable despite the simplicity and intuitiveness of the model. In this paper, we propose a two-step feature selection method: a univariate feature selection step first reduces the search space, and a feature clustering step then selects relatively independent feature sets. We demonstrate the effectiveness of the method through a thorough evaluation and comparison over 13 datasets. The resulting performance improvement makes naïve Bayes comparable or superior to other classifiers. The proposed algorithm is also shown to outperform traditional methods such as greedy-search-based wrappers and CFS.
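The two-step idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract names neither the univariate statistic nor the clustering algorithm, so the chi-squared score, the greedy correlation-threshold grouping, the function names, the threshold value, and the toy data are all assumptions made for illustration.

```python
from math import sqrt

def chi2_score(column, labels):
    """Chi-squared statistic of a binary term-presence column vs. binary labels."""
    n = len(labels)
    # 2x2 contingency counts: term present/absent against class 1/0.
    a = sum(1 for x, y in zip(column, labels) if x and y)
    b = sum(1 for x, y in zip(column, labels) if x and not y)
    c = sum(1 for x, y in zip(column, labels) if not x and y)
    d = n - a - b - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def abs_corr(u, v):
    """Absolute Pearson correlation of two columns (0 if either is constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    su = sqrt(sum((x - mu) ** 2 for x in u))
    sv = sqrt(sum((y - mv) ** 2 for y in v))
    return 0.0 if su == 0 or sv == 0 else abs(cov / (su * sv))

def two_step_select(columns, labels, top_k, corr_threshold):
    """Hypothetical helper: univariate filter, then redundancy grouping."""
    # Step 1 (assumed statistic: chi-squared): keep the top_k terms.
    scores = {name: chi2_score(col, labels) for name, col in columns.items()}
    ranked = sorted(scores, key=lambda name: (-scores[name], name))[:top_k]
    # Step 2 (assumed grouping rule): a term joins the cluster of an already
    # selected term when their correlation reaches the threshold, so only the
    # best-scoring representative of each cluster is kept.
    selected = []
    for name in ranked:
        if all(abs_corr(columns[name], columns[s]) < corr_threshold
               for s in selected):
            selected.append(name)
    return selected

# Toy data (invented): six documents, label 1 = spam, 0 = not spam.
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "free":    [1, 1, 1, 0, 0, 0],
    "offer":   [1, 1, 1, 0, 0, 0],  # redundant with "free"
    "win":     [1, 1, 0, 0, 0, 0],
    "meeting": [0, 0, 0, 1, 1, 0],
    "the":     [1, 1, 1, 1, 1, 1],  # constant, carries no class information
    "today":   [1, 0, 1, 0, 1, 0],
}
print(two_step_select(columns, labels, top_k=4, corr_threshold=0.8))
# → ['free', 'meeting', 'win']
```

In this toy run the filter discards the uninformative terms "the" and "today", and the grouping step drops "offer" because it duplicates "free". The surviving terms are less correlated with one another, which is closer to the feature-independence assumption that naïve Bayes relies on.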

Publisher

Hindawi Limited

Subject

General Medicine

Link

http://downloads.hindawi.com/archive/2014/717092.pdf

Reference: 17 articles.

1. Effective Methods for Improving Naive Bayes Text Classifiers

2. Empirical Study on Filter based Feature Selection Methods for Text Classification

Cited by 43 articles.

1. A two-stage feature selection approach using hybrid elitist self-adaptive cat and mouse based optimization algorithm for document classification;Expert Systems with Applications;2024-11

2. Predictors of In-Hospital Mortality after Thrombectomy in Anterior Circulation Large Vessel Occlusion: A Retrospective, Machine Learning Study;Diagnostics;2024-07-16

3. A novel two-stage wrapper feature selection approach based on greedy search for text sentiment classification;Neurocomputing;2024-07

4. Feature Selection for Data Classification in the Semiconductor Industry by a Hybrid of Simplified Swarm Optimization;Electronics;2024-06-07

5. Reliable feature selection for adversarially robust cyber-attack detection;Annals of Telecommunications;2024-06-07