Affiliation:
1. Department of Computer Science, Colorado State University, Fort Collins, Colorado, USA
2. Spectra Logic, Boulder, Colorado, USA
3. Department of Biosystems Engineering, University of Arizona, Tucson, Arizona, USA
4. College of Agriculture, Colorado State University, Fort Collins, Colorado, USA
Abstract
Remote sensing of plant traits and their environment facilitates non‐invasive, high‐throughput monitoring of plants' physiological characteristics. However, the voluminous observational data generated by such autonomous sensor networks overwhelms scientific users when they have to analyze it. Providing a scalable and effective analysis environment requires storage and analytics that support high‐throughput data ingestion while preserving spatiotemporal and sensor‐specific characteristics. The framework should also enable modelers and scientists to run their analytics while coping with the fast and continuously evolving nature of the dataset. In this paper, we present Radix+, a high‐throughput distributed data storage system that supports scalable georeferencing and interactive, query‐based spatiotemporal analytics with trackable data integrity. We include empirical evaluations performed on a commodity machine cluster with up to 1 TB of data. Our benchmarks demonstrate subsecond latency for the majority of our evaluated queries and an improvement in data ingestion rate over systems such as GeoMesa.
Funder
National Institute of Food and Agriculture
National Science Foundation of Sri Lanka
Subject
Computational Theory and Mathematics, Computer Networks and Communications, Computer Science Applications, Theoretical Computer Science, Software