Affiliation:
1. Faculty of Engineering, Arak University, Sardasht Sq., Arak, Iran
Abstract
Due to the increasing growth of data, many methods are proposed to extract useful data and remove noisy data. Instance selection is one of these methods which selects some instances of a data set and removes others. This paper proposes a new instance selection algorithm based on ReliefF, which is a feature selection algorithm. In the proposed algorithm, based on the Jaccard index, the nearest instances of each class are found for each instance. Then, based on the nearest neighbor’s set, the weight of each instance is calculated. Finally, only instances with more weights are selected. This algorithm can reduce data at a specified rate and have the ability to run parallel on the instances. It can work on a variety of data sets with nominal and numeric data with missing values and is also suitable for working with imbalanced data sets. The proposed algorithm tests on three data sets. Results show that the proposed algorithm can reduce the volume of data, without a significant change in classification accuracy of these datasets.
Publisher
World Scientific Pub Co Pte Lt
Subject
Artificial Intelligence,Artificial Intelligence
Cited by
10 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献