Decision-Tree-Based Horizontal Fragmentation Method for Data Warehouses

Authors

Rodríguez-Mazahua Nidia, Rodríguez-Mazahua Lisbeth, López-Chau Asdrúbal, Alor-Hernández Giner, Machorro-Cano Isaac

Abstract

Data warehousing provides frameworks and tools that enable enterprise managers to systematically organize, understand, and use data to improve strategic decision-making. One of the principal challenges for data warehouse designers is fragmentation. Several fragmentation approaches for data warehouses have been developed, since this technique can decrease OLAP (online analytical processing) query response time and provides considerable benefits in table loading and maintenance tasks. This paper presents FTree, a horizontal fragmentation method that uses decision trees to fragment data warehouses, taking advantage of the effectiveness of this technique in classification. FTree determines the most relevant OLAP queries, evaluates the predicates found in the workload, and, based on these, builds a decision tree to select the horizontal fragmentation scheme. To verify that the design is correct, the method was first evaluated with the SSB (Star Schema Benchmark); later, a tourist data warehouse was built and the fragmentation method was tested on it. The experimental results demonstrated the efficacy of the method.
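
The general idea of predicate-driven horizontal fragmentation can be illustrated with a minimal sketch. It is not the FTree algorithm: instead of a learned decision tree, it uses a fixed-order binary tree with one level per predicate, and every leaf becomes one fragment. The rows, the workload, the frequency threshold, and the names relevant_predicates and fragment are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical fact-table rows (illustrative data, not from the paper).
rows = [
    {"id": 1, "region": "ASIA", "year": 1993},
    {"id": 2, "region": "ASIA", "year": 1997},
    {"id": 3, "region": "EUROPE", "year": 1993},
    {"id": 4, "region": "AMERICA", "year": 1997},
]

# Hypothetical workload: (query name, execution frequency, predicates used).
workload = [
    ("q1", 50, ["region = ASIA"]),
    ("q2", 30, ["year < 1995"]),
    ("q3", 2, ["region = EUROPE"]),
]

# Predicate label -> boolean test on a row (assumed representation).
predicates = {
    "region = ASIA": lambda r: r["region"] == "ASIA",
    "year < 1995": lambda r: r["year"] < 1995,
    "region = EUROPE": lambda r: r["region"] == "EUROPE",
}


def relevant_predicates(workload, min_frequency):
    """Collect predicates of queries that run at least min_frequency times.

    Queries are visited from most to least frequent, so the returned
    list is ordered by the relevance of the query that introduced each
    predicate.
    """
    kept = []
    for _name, freq, preds in sorted(workload, key=lambda q: -q[1]):
        if freq >= min_frequency:
            for p in preds:
                if p not in kept:
                    kept.append(p)
    return kept


def fragment(rows, predicate_names):
    """Route each row down a fixed-order binary predicate tree.

    The path of True/False outcomes identifies a leaf; each leaf is one
    horizontal fragment, so every row lands in exactly one fragment.
    """
    fragments = defaultdict(list)
    for row in rows:
        path = tuple(predicates[p](row) for p in predicate_names)
        fragments[path].append(row["id"])
    return dict(fragments)


selected = relevant_predicates(workload, min_frequency=10)
scheme = fragment(rows, selected)
for path, ids in sorted(scheme.items(), reverse=True):
    print(dict(zip(selected, path)), "->", ids)
```

Because each row follows exactly one root-to-leaf path, the resulting fragments are disjoint and together cover the whole table, which are the usual correctness requirements for a horizontal fragmentation scheme. A learned decision tree would instead choose which predicate to test at each node from the data.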

Funder

National Council of Science and Technology

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Cited by 2 articles.