A review of data abstraction

Authors

Cima Gianluca, Console Marco, Lenzerini Maurizio, Poggi Antonella

Abstract

It is well known that Artificial Intelligence (AI), and in particular Machine Learning (ML), is not effective without good data preparation, as also pointed out by the recent wave of data-centric AI. Data preparation is the process of gathering, transforming, and cleaning raw data prior to processing and analysis. Since data nowadays often reside in distributed and heterogeneous sources, the first activity of data preparation is collecting data from suitable data sources and data services. It is thus essential that providers describe their data services in a way that makes them compliant with the FAIR guiding principles, i.e., automatically Findable, Accessible, Interoperable, and Reusable. The notion of data abstraction has been introduced precisely to meet this need. Abstraction is a kind of reverse engineering task that automatically provides a semantic characterization of a data service made available by a provider. The goal of this paper is to review the results obtained so far in data abstraction by presenting the formal framework for its definition, reporting on the decidability and complexity of the main theoretical problems concerning abstraction, and discussing open issues and interesting directions for future research.

Publisher

Frontiers Media SA

Subject

Artificial Intelligence
