Abstract
Tremendous quantities of numeric data are generated as streams in various cyber ecosystems. Sorting is one of the most fundamental operations for gaining knowledge from data. However, because data storage, both inside and outside the CPU, is limited in size relative to massive streaming data sources, the data can overflow the storage. Consequently, classic sorting algorithms cannot produce a correctly sorted sequence, because the data to be sorted cannot be held in storage in their entirety. This paper proposes a new sorting algorithm, called streaming data sort, for streaming data on a uniprocessor, subject to a limited storage size and the requirement that the sorted order be correct. Data flow continuously into the storage as consecutive chunks, each smaller than the storage size. A theoretical analysis of the space bound and the time complexity is provided. The sorting time complexity is O(n), where n is the number of incoming data items. The space complexity is O(M), where M is the storage size. The experimental results show that streaming data sort can sort one million permuted data items using storage as small as 35% of the data size. The proposed concept can be applied in practice to applications in different fields where the data always overflow the working storage and sorting is required.
Funder
Development and Promotion of Science and Technology Talents Project
Thailand Research Fund
Cited by
1 article.