Author:
Shen Bo,Liao Yi-Chen,Liu Dan,Chao Han-Chieh
Abstract
Big data gathered from real systems, such as public infrastructure, healthcare, smart homes, industries, and so on, by sensor networks contain enormous value, and need to be mined deeply, which depends on a data storing and retrieving service. HBase is playing an increasingly important part in the big data environment since it provides a flexible pattern for storing extremely large amounts of unstructured data. Despite the fast-speed reading by RowKey, HBase does not natively support multi-conditional query, which is a common demand and operation in relational databases, especially for data analysis of ubiquitous sensing applications. In this paper, we introduce a method to construct a linear index by employing a Hilbert space-filling curve. As a RowKey generating schema, the proposed method maps multiple index-columns into a one-dimensional encoded sequence, and then constructs a new RowKey. We also provide a R-tree-based optimization to reduce the computational cost of encoding query conditions. Without using a secondary index mode, experimental results indicate that the proposed method has better performance in multi-conditional queries.
Subject
Electrical and Electronic Engineering,Biochemistry,Instrumentation,Atomic and Molecular Physics, and Optics,Analytical Chemistry
Cited by
3 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
1. An HBase-Based Optimization Model for Distributed Medical Data Storage and Retrieval;Electronics;2023-02-16
2. A Map Tile Data Access Model Based on the Jump Consistent Hash Algorithm;ISPRS International Journal of Geo-Information;2022-12-06
3. CSIndex: A Coprocessor-Based Classified Secondary Index Mechanism for Efficient HBase Query;2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom);2019-12