Affiliation:
1. University of Edinburgh and Bell Laboratories
2. University of Edinburgh
Abstract
We present Semandaq, a prototype system for improving the quality of relational data. Based on the recently proposed conditional functional dependencies (CFDs), it detects and repairs errors and inconsistencies that emerge as violations of these constraints. We demonstrate the following functionalities supported by Semandaq: (a) an interface for specifying CFDs; (b) a visual tool for automated detection of CFD violations in relational data, leveraging efficient SQL-based techniques; (c) extensive visual data exploration capabilities that provide the user with various measures of the quality of the data; (d) repair (cleaning) functionality without excess human interaction, built upon CFD-based cleaning algorithms; we show how Semandaq allows for a natural exploration of the quality of the obtained repairs. Semandaq is a promising tool that provides easy access and user-friendly data quality facilities for any relational database system.
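To make the notion of a CFD violation concrete, the following is a minimal in-memory sketch in Python. It is not Semandaq's implementation, which the abstract describes as SQL-based; the function name `find_cfd_violations`, the attribute names (CC, AC, ZIP, STR, CITY), and the sample rows are all assumptions chosen for illustration. A CFD is modelled here as a functional dependency plus a pattern that holds either a constant or the wildcard `_` for each attribute.

```python
from collections import defaultdict

WILDCARD = "_"  # pattern entry that matches any value


def matches(value, pattern):
    """A value matches a pattern entry if the entry is a wildcard or an equal constant."""
    return pattern == WILDCARD or value == pattern


def find_cfd_violations(rows, lhs, rhs, pattern):
    """Check the CFD (lhs -> rhs, pattern) against rows (a list of dicts).

    Returns (single, multi):
      single: indices of rows that match the LHS pattern but contradict
              a constant in the RHS pattern;
      multi:  LHS value combinations that match the pattern and map to
              more than one RHS value.
    Hypothetical helper for illustration only.
    """
    single = []
    groups = defaultdict(set)
    for i, row in enumerate(rows):
        # The CFD only constrains rows that match the LHS pattern.
        if not all(matches(row[a], pattern[a]) for a in lhs):
            continue
        if not matches(row[rhs], pattern[rhs]):
            single.append(i)
        groups[tuple(row[a] for a in lhs)].add(row[rhs])
    multi = sorted(key for key, values in groups.items() if len(values) > 1)
    return single, multi


# Made-up customer rows: country code, area code, zip, street, city.
rows = [
    {"CC": "44", "AC": "131", "ZIP": "EH4 1DT", "STR": "Mayfield", "CITY": "EDI"},
    {"CC": "44", "AC": "131", "ZIP": "EH4 1DT", "STR": "Crichton", "CITY": "EDI"},
    {"CC": "01", "AC": "908", "ZIP": "07974", "STR": "Tree Ave.", "CITY": "MH"},
    {"CC": "01", "AC": "908", "ZIP": "07974", "STR": "Elm Str.", "CITY": "NYC"},
]

# "Where CC = 44, ZIP determines STR" (the dependency holds only conditionally).
uk_zip_street = find_cfd_violations(
    rows, ["CC", "ZIP"], "STR", {"CC": "44", "ZIP": "_", "STR": "_"}
)

# "Where CC = 01 and AC = 908, CITY must be MH" (constant on the right-hand side).
us_area_city = find_cfd_violations(
    rows, ["CC", "AC"], "CITY", {"CC": "01", "AC": "908", "CITY": "MH"}
)
```

In this sketch the first constraint is violated only by the pair of UK rows that share a zip code but disagree on the street, while the second is violated by the single row whose city contradicts the constant in the pattern. The same two kinds of check could be phrased as SQL queries over a table of pattern tuples, which is closer in spirit to the SQL-based detection the abstract mentions.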
Subject
General Earth and Planetary Sciences,Water Science and Technology,Geography, Planning and Development
Cited by
17 articles.
1. BClean: A Bayesian Data Cleaning System;2024 IEEE 40th International Conference on Data Engineering (ICDE);2024-05-13
2. Automatic Rollback Suggestions for Incremental Datalog Evaluation;Practical Aspects of Declarative Languages;2023
3. DaQL 2.0: Measure Data Quality based on Entity Models;Procedia Computer Science;2021
4. Data Profiling;Synthesis Lectures on Data Management;2018-11-07
5. Relaxed Functional Dependencies—A Survey of Approaches;IEEE Transactions on Knowledge and Data Engineering;2016-01-01