Authors
Zhang Feng, Xue Hui-Feng, Xu Dong-Sheng, Zhang Yong-Heng, You Fei
Abstract
Big data cleaning is an important research issue in cloud computing theory. Existing data cleaning algorithms assume that all the data can be loaded into main memory at one time, which is infeasible for big data. To address this, a knowledge-base-driven data cleaning algorithm is proposed for cloud computing using Map-Reduce. It first extracts the atomic knowledge of the selected nodes, then analyzes their relations, deletes duplicate objects, builds an atomic knowledge sequence ordered by weight, and finally cleans the data according to that sequence. The experimental results show that the proposed big data cleaning algorithm for the cloud computing environment is effective and feasible, and has better scalability.
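The pipeline described in the abstract can be illustrated with a minimal, single-process sketch. This is not the authors' implementation: the abstract does not define "atomic knowledge", the weighting scheme, or the Map-Reduce job layout, so everything below is an assumption made for illustration. Here an atomic knowledge item is modeled as a hypothetical `(name, weight, repair function)` tuple, the weight-ordered sequence is a simple descending sort, the map step cleans one chunk of records at a time (so no step needs the whole data set in memory), and the reduce step drops duplicate objects that share a key. The field names, rules, and weights are invented.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Record = Dict[str, str]
# Hypothetical stand-in for an "atomic knowledge" item:
# (name, weight, repair function). Not taken from the paper.
Rule = Tuple[str, float, Callable[[Record], Record]]


def strip_whitespace(rec: Record) -> Record:
    """Trim leading and trailing whitespace from every field."""
    return {k: v.strip() for k, v in rec.items()}


def lowercase_email(rec: Record) -> Record:
    """Normalize the (assumed) email field to lower case."""
    out = dict(rec)
    if "email" in out:
        out["email"] = out["email"].lower()
    return out


# Invented weights: a higher weight means the rule is applied earlier.
RULES: List[Rule] = [
    ("lowercase_email", 0.5, lowercase_email),
    ("strip_whitespace", 0.9, strip_whitespace),
]


def rule_sequence(rules: List[Rule]) -> List[Rule]:
    """Build the cleaning sequence: rules sorted by descending weight."""
    return sorted(rules, key=lambda rule: rule[1], reverse=True)


def map_clean(chunk: Iterable[Record],
              rules: List[Rule]) -> Iterator[Tuple[str, Record]]:
    """Map step: clean each record of one chunk, emit (key, record)."""
    for rec in chunk:
        for _name, _weight, repair in rules:
            rec = repair(rec)
        # Assumption: the cleaned email identifies an object.
        # Records without an email all share the empty key.
        yield rec.get("email", ""), rec


def reduce_dedupe(pairs: Iterable[Tuple[str, Record]]) -> List[Record]:
    """Reduce step: keep the first record seen for each key."""
    groups: Dict[str, List[Record]] = defaultdict(list)
    for key, rec in pairs:
        groups[key].append(rec)
    return [recs[0] for recs in groups.values()]


def clean(chunks: Iterable[Iterable[Record]],
          rules: List[Rule] = RULES) -> List[Record]:
    """Run the map step over every chunk, then the reduce step."""
    ordered = rule_sequence(rules)
    pairs = (pair for chunk in chunks for pair in map_clean(chunk, ordered))
    return reduce_dedupe(pairs)


if __name__ == "__main__":
    chunks = [
        [{"name": " Ann ", "email": " ANN@X.org "}],
        [{"name": "Ann", "email": "ann@x.org"},
         {"name": "Bob", "email": "bob@x.org"}],
    ]
    for record in clean(chunks):
        print(record)
```

In a real Map-Reduce deployment the chunks would be input splits processed on separate nodes and the grouping by key would be done by the framework's shuffle phase; the in-memory `defaultdict` only stands in for that step.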
Publisher
International Association of Online Engineering (IAOE)
Cited by
13 articles.