Author:
Yao Zhixin, Zhang Jianqin, Li Taizeng, Ding Ying
Abstract
Trajectory big data is updated rapidly and accumulates in very large volumes, which makes it well suited to distributed storage and retrieval. Existing approaches, however, suffer from write hotspots, storage skew, high I/O overhead, and slow retrieval. To address these problems, this paper proposes a trajectory big data model that combines data partitioning with a spatio-temporal, multi-perspective hierarchical organization. At the spatial level, the model partitions trajectory data along the Hilbert curve and combines this with a pre-partitioning mechanism to resolve write hotspots and storage skew in the distributed database HBase. At the temporal level, the model uses the day as the organizational unit, encodes time within each day at minute granularity, and fuses this encoding with the data partitioning to build a spatio-temporal hybrid encoding that organizes the trajectory data hierarchically for efficient storage and retrieval. Experimental results show that the model effectively improves the storage and retrieval speed of trajectory big data across different orders of magnitude of data volume while keeping write and query speeds relatively stable, providing an efficient data model for trajectory big data mining and analysis.
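The abstract does not give the exact key layout, so the following is only a minimal sketch of how a Hilbert-based partition prefix, a day unit, and a minute-of-day code might be fused into one row key. The field order, the `|` separator, the bounding box, the grid order, and the names `grid_cell`, `row_key`, and `split_keys` are all assumptions made for illustration, not the authors' implementation.

```python
from datetime import datetime


def hilbert_index(order: int, x: int, y: int) -> int:
    """Map grid cell (x, y) to its position on a Hilbert curve of the given order."""
    n = 1 << order          # grid is n x n cells
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def grid_cell(lon, lat, bbox, order):
    """Snap a coordinate to a cell of a 2^order x 2^order grid over bbox (assumed extent)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    n = 1 << order
    col = int((lon - min_lon) / (max_lon - min_lon) * n)
    row = int((lat - min_lat) / (max_lat - min_lat) * n)
    # Clamp points on or outside the boundary into the grid.
    return min(n - 1, max(0, col)), min(n - 1, max(0, row))


def row_key(traj_id, lon, lat, ts, bbox, order=8, partitions=16):
    """Hypothetical hybrid key: partition | day | minute-of-day | Hilbert cell | id."""
    x, y = grid_cell(lon, lat, bbox, order)
    h = hilbert_index(order, x, y)
    cells = 1 << (2 * order)
    part = h * partitions // cells       # contiguous Hilbert ranges -> partitions
    minute = ts.hour * 60 + ts.minute    # 0..1439 within the day
    width = len(str(cells - 1))
    return f"{part:02d}|{ts:%Y%m%d}|{minute:04d}|{h:0{width}d}|{traj_id}"


def split_keys(partitions):
    """Split points that could be supplied when pre-partitioning a table."""
    return [f"{p:02d}" for p in range(1, partitions)]


if __name__ == "__main__":
    bbox = (116.0, 39.6, 117.0, 40.4)    # assumed study area, for illustration only
    print(row_key("t1", 116.0, 39.6, datetime(2023, 5, 1, 8, 30), bbox))
    print(split_keys(4))
```

In this sketch the leading partition prefix spreads concurrent writes across pre-split regions, while the day and minute fields keep records from the same time window adjacent within each partition. A real deployment would need to choose the grid order and partition count from the data distribution.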
Funder
the Beijing Natural Science Foundation
the National Natural Science Foundation of China
Subject
Earth and Planetary Sciences (miscellaneous); Computers in Earth Sciences; Geography, Planning and Development
Cited by
3 articles.