Authors:
Nematzadeh Zahra, Ibrahim Roliana, Selamat Ali, Nazerian Vahdat
Abstract
Purpose
This study aims to improve data quality, overall accuracy and certainty by reducing the negative effects of the FCM algorithm when clustering real-world data and by decreasing the inherent noise in data sets.
Design/methodology/approach
The present study proposes a new model, called the FCM-ENS model, based on fuzzy C-means (FCM), ensemble filtering (ENS) and machine learning algorithms. The model is composed of three main parts: noise detection, noise filtering and noise classification.
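As an illustration of the noise detection idea only, the following is a minimal sketch of how fuzzy C-means memberships could be used to flag suspected class noise. It is not the authors' FCM-ENS implementation: the flagging rule (a confidently clustered instance whose label disagrees with its cluster's majority label), the `threshold` parameter, the function names and the toy data are all assumptions made for this example, and the ensemble filtering stage is not shown.

```python
import random


def fcm(points, c=2, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain fuzzy C-means. Returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: means of the points weighted by membership^m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1)).
        new_u = []
        for p in points:
            dist = [
                max(sum((a - b) ** 2 for a, b in zip(p, ctr)) ** 0.5, 1e-12)
                for ctr in centers
            ]
            new_u.append(
                [1.0 / sum((dist[j] / dk) ** (2.0 / (m - 1.0)) for dk in dist)
                 for j in range(c)]
            )
        shift = max(abs(new_u[i][j] - u[i][j])
                    for i in range(n) for j in range(c))
        u = new_u
        if shift < tol:
            break
    return centers, u


def flag_suspected_noise(points, labels, c=2, threshold=0.8, seed=0):
    """Hypothetical rule: flag instances that belong to a cluster with high
    membership but carry a label different from that cluster's majority."""
    _, u = fcm(points, c=c, seed=seed)
    hard = [max(range(c), key=lambda j: row[j]) for row in u]
    majority = {}
    for j in range(c):
        members = [labels[i] for i in range(len(points)) if hard[i] == j]
        if members:
            majority[j] = max(set(members), key=members.count)
    return [
        i for i in range(len(points))
        if u[i][hard[i]] >= threshold and labels[i] != majority[hard[i]]
    ]


# Toy data (invented for this sketch): two well-separated groups,
# with the label of the instance at index 3 deliberately flipped.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2),
          (5.0, 5.0), (5.2, 5.1), (5.1, 4.8), (4.9, 5.2)]
labels = ["a", "a", "a", "b", "b", "b", "b", "b"]
suspects = flag_suspected_noise(points, labels)
print(suspects)
```

In a pipeline of the kind the abstract describes, the flagged instances would then be passed to a filtering stage rather than removed outright; how the paper combines the two stages is not specified here.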
Findings
The proposed model was evaluated through experiments on six data sets from the UCI repository. The results show that the proposed noise detection model detected class noise effectively and that performance improved when the instances identified as class noise were removed.
Originality/value
To the best of the authors’ knowledge, no effort has been made to improve the FCM algorithm in relation to class noise detection. Thus, the novelty of this research lies in combining the FCM algorithm, used as a noise detection technique, with ENS to reduce the negative effect of inherent noise and increase data quality and accuracy.
Subject
Computational Theory and Mathematics, Computer Science Applications, General Engineering, Software
Cited by
6 articles.