Affiliation:
1. Auckland University of Technology, Auckland, New Zealand
Abstract
Online stream processing is an emerging research area in computer science. Semi-stream processing is a particular type of stream processing in which a stream of data is processed against a disk-based relation. A semi-stream join operator is required to implement this operation. Many semi-stream joins use a queue of stream tuples to amortize the access cost for the disk-based relation, and use an index to allow directed access to the relation, avoiding the loading of unnecessary partitions of [Formula: see text]. In such a situation, the question arises of which [Formula: see text] partitions should be accessed, as any stream tuple from the queue could serve as a lookup element for accessing the relation index. Existing algorithms use simple strategies that are safe and correct but not optimal, in the sense that they do not maximize the join service rate. This paper makes two contributions. The first is an optimization: we analyze strategies for selecting an appropriate lookup element, particularly for skewed stream data, and show that a good selection strategy can significantly improve the service rate of existing join algorithms. The second is an extension: we develop a multi-stage join for semi-stream join algorithms. A multi-stage join is important when stream data needs to be joined with two or more tables in the relation, e.g., when a stream of sales data needs information added from the product and customer tables. To the best of our knowledge, none of the existing algorithms implements this feature. For the service rate evaluation we use two well-performing existing algorithms, CACHEJOIN and HYBRIDJOIN. We evaluate the service rate using real, TPC-H, and synthetic datasets with a known skewed distribution. We also present the cost model for our multi-stage join.
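The role of the lookup element can be illustrated with a toy simulation. The sketch below is not the paper's method and does not reproduce CACHEJOIN or HYBRIDJOIN; it only assumes what the abstract states: stream tuples wait in a queue, an index maps a join key to a partition of the relation, and loading one partition serves every queued tuple that falls in it. All names (`partition_of`, `pick_oldest`, `pick_most_frequent`, `run_join`), the integer keys, and the fixed-size partition layout are hypothetical choices made for illustration.

```python
from collections import Counter, deque


def partition_of(key, partition_size):
    # Hypothetical index: integer keys stored in fixed-size partitions.
    return key // partition_size


def pick_oldest(queue, partition_size):
    # Baseline assumption: use the tuple at the head of the queue.
    return queue[0]


def pick_most_frequent(queue, partition_size):
    # Alternative assumption: use a tuple from the partition that
    # currently has the most queued tuples (useful under skew).
    counts = Counter(partition_of(k, partition_size) for k in queue)
    best = counts.most_common(1)[0][0]
    return next(k for k in queue if partition_of(k, partition_size) == best)


def run_join(stream_keys, pick, partition_size=4, queue_capacity=8):
    """Return (tuples joined, partitions loaded) for a finite stream."""
    queue = deque()
    pending = iter(stream_keys)
    exhausted = False
    joined = 0
    loads = 0
    while True:
        # Refill the queue from the stream up to its capacity.
        while not exhausted and len(queue) < queue_capacity:
            try:
                queue.append(next(pending))
            except StopIteration:
                exhausted = True
        if not queue:
            break
        # Choose a lookup element and load its partition (one disk access).
        target = partition_of(pick(queue, partition_size), partition_size)
        loads += 1
        # Every queued tuple in the loaded partition is joined and removed.
        remaining = deque(
            k for k in queue if partition_of(k, partition_size) != target
        )
        joined += len(queue) - len(remaining)
        queue = remaining
    return joined, loads


if __name__ == "__main__":
    stream = [0, 1, 9, 2, 17, 3, 0, 1, 10, 2, 0, 25, 1, 3]
    for strategy in (pick_oldest, pick_most_frequent):
        joined, loads = run_join(stream, strategy, queue_capacity=4)
        print(strategy.__name__, "joined:", joined, "loads:", loads)
```

In this toy model, the ratio of joined tuples to partition loads stands in for the service rate: fewer loads for the same number of joined tuples means each disk access is amortized over more stream tuples.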
Publisher
World Scientific Pub Co Pte Lt
Subject
Computer Science (miscellaneous)
Cited by
1 article.