A Distributed Infrastructure for Earth-Science Big Data Retrieval

Author:

Liakos Panagiotis1,Koltsida Panagiota1,Kakaletris George1,Baumann Peter2,Ioannidis Yannis13,Delis Alex13

Affiliation:

1. Athena Research and Innovation Center, 15125 Maroussi, Greece

2. Jacobs University Bremen, 28759 Bremen, Germany

3. University of Athens, 15784 Athens, Greece

Abstract

Earth-Science data are composite, multi-dimensional and of significant size, and as such, continue to pose a number of ongoing problems regarding their management. With new and diverse information sources emerging as well as rates of generated data continuously increasing, a persistent challenge becomes more pressing: To make the information existing in multiple heterogeneous resources readily available. The widespread use of the XML data-exchange format has enabled the rapid accumulation of semi-structured metadata for Earth-Science data. In this paper, we exploit this popular use of XML and present the means for querying metadata emanating from multiple sources in a succinct and effective way. Thereby, we release the user from the very tedious and time consuming task of examining individual XML descriptions one by one. Our approach, termed Meta-Array Data Search (MAD Search), brings together diverse data sources while enhancing the user-friendliness of the underlying information sources. We gather metadata using different standards and construct an amalgamated service with the help of tools that discover and harvest such metadata; this service facilitates the end-user by offering easy and timely access to all metadata. The main contribution of our work is a novel query language termed xWCPS, that builds on top of two widely-adopted standards: XQuery and the Web Coverage Processing Service (WCPS). xWCPS furnishes a rich set of features regarding the way scientific data can be queried with. Our proposed unified language allows for requesting metadata while also giving processing directives. Consequently, the xWCPS-enabled MAD Search helps in both retrieval and processing of large data sets hosted in an heterogeneous infrastructure. We demonstrate the effectiveness of our approach through diverse use-cases that provide insights into the syntactic power and overall expressiveness of xWCPS. We evaluate MAD Search in a distributed environment that comprises five high-volume array-databases whose sizes range between 20 and 100 GB and so, we ascertain the applicability and potential of our proposal.

Publisher

World Scientific Pub Co Pte Lt

Subject

Computer Science Applications,Information Systems

Cited by 8 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

1. Array databases: concepts, standards, implementations;Journal of Big Data;2021-02-02

2. INSPIRE coverages: an analysis and some suggestions;Open Geospatial Data, Software and Standards;2019-02-11

3. Interactive Data Exploration of Distributed Raw Files: A Systematic Mapping Study;IEEE Access;2019

4. Datacubes: Towards Space/Time Analysis-Ready Data;Lecture Notes in Geoinformation and Cartography;2018-06-08

5. A service-based framework for the OAIS model for earth science data management;Earth Science Informatics;2017-02-15

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3