Affiliation:
1. Department of Computer Science, University of California, Irvine
Abstract
This paper explores "on-the-fly" data cleaning in the context of a user query. A novel Query-Driven Approach (QDA) is developed that performs only the minimal number of cleaning steps necessary to answer a given selection query correctly. A comprehensive empirical evaluation of the proposed approach demonstrates its significant efficiency advantage over traditional techniques for query-driven applications.
Subject
General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development
Cited by
36 articles.
1. A Suite of Efficient Randomized Algorithms for Streaming Record Linkage;IEEE Transactions on Knowledge and Data Engineering;2024-07
2. ZIP: Lazy Imputation during Query Processing;Proceedings of the VLDB Endowment;2023-09
3. BrewER: Entity Resolution On-Demand;Proceedings of the VLDB Endowment;2023-08
4. A Randomized Blocking Structure for Streaming Record Linkage;Proceedings of the VLDB Endowment;2023-07
5. Metam: Goal-Oriented Data Discovery;2023 IEEE 39th International Conference on Data Engineering (ICDE);2023-04