Affiliation:
1. Universidade Federal do Rio de Janeiro
Abstract
This paper proposes an improvement to an ontology-driven multi-level conceptual model for the data catalogue domain. Data catalogues gather metadata that describe resources held in different, heterogeneous digital platforms (repositories). They are supported by Information Systems (IS) that use these descriptors to provide visibility and to support resource exploration and analysis. Domain ontologies are essential to promoting quality in ISs, as they are developed to reflect the intended reality. The proposed conceptual model is well-founded on the Unified Foundational Ontology and the Multi-Level Theory, and is based on the widely used DCAT vocabulary, a standardized metadata schema for describing datasets and data services. The resulting model addresses ambiguities and includes high-level types that contribute to the conformance of domain concepts and relationships. In addition, these high-level types provide knowledge about the different types of resource descriptors and relationships contained in a specific catalogue, which facilitates its management. The paper enhances the previous model by extending it to handle descriptors that represent a dataset according to the data equivalence across its multiple distributions. We also demonstrate the model by describing a dataset, taken from a real-world scenario, whose distributions are not data-equivalent, thus providing a structured representation for managing metadata sets in the data catalogue domain.