Affiliation:
1. FORTH-ICS, Greece & University of Crete, Greece
2. FORTH-ICS, Greece
Abstract
Many applications need to fetch and assemble pieces of information from more than one source in order to build a semantic warehouse that offers more advanced query capabilities. This chapter describes the corresponding requirements and challenges, and focuses on the quality, value and evolution of the warehouse. It details various metrics (or measures) for quantifying the connectivity of a warehouse and, consequently, the warehouse's ability to answer complex queries. The proposed metrics give an overview of each source's contribution to the warehouse and quantify the value of the entire warehouse. Moreover, the chapter shows how the metrics can be used for monitoring a warehouse after a reconstruction, thereby reducing the cost of quality checking and helping to understand its evolution over time. The behaviour of these metrics is demonstrated in the context of a real and operational semantic warehouse for the marine domain. Finally, the chapter discusses novel ways to exploit such metrics at global scale and for visualization purposes.