Affiliation:
1. V. R. Siddhartha Engineering College, India
2. Illinois State University, USA
Abstract
We are moving towards digitization, connecting devices such as sensors and cameras to the internet and producing big data. The variety of this data has paved the way for NoSQL databases such as Cassandra, which offer scalability and availability. The Hadoop framework was developed for storing and processing distributed data. In this chapter, the authors investigate the storage and retrieval of geospatial data by integrating Hadoop and Cassandra using two partitioning techniques: prefix-based partitioning and Cassandra's default partitioner, Murmur3Partitioner. A geohash value is generated, which acts as the partition key and also supports effective search, reducing the time taken to retrieve data. When users issue spatial queries, such as finding the nearest locations, the Cassandra database is searched using both partitioning techniques. Query response times are compared to determine which method is more effective. The results show that prefix-based partitioning is more efficient than Murmur3 partitioning.
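To illustrate the idea of a geohash serving as a partition key, the following is a minimal sketch of the standard geohash encoding (interleaving longitude and latitude bits and mapping each group of five bits to a base-32 character). It is not the authors' implementation; the function names `geohash_encode` and `partition_key` and the `prefix_len` parameter are hypothetical, chosen only for this example.

```python
# Standard geohash base-32 alphabet (omits a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude

    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def partition_key(geohash, prefix_len=4):
    """Hypothetical helper: use the leading characters as a partition key.

    Assumption for illustration only; the prefix length used in the
    chapter is not stated in the abstract.
    """
    return geohash[:prefix_len]
```

Because geohashes of nearby points tend to share a common prefix, grouping rows by such a prefix keeps spatially close records together, whereas a hash-based partitioner such as Murmur3Partitioner distributes keys without regard to spatial proximity.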