Affiliation:
1. Enterprise Data Warehouse Engineering-ETL, Intel Corporation, USA
Abstract
In a large data warehouse, thousands of jobs run during each cycle across dozens of subject areas. Many of the data warehouse tables are quite large, and they need to be refreshed at the right time, several times a day, to support strategic business decisions. To enable cycles to run more frequently and keep the data warehouse environment stable, the database system's resource utilization must be optimal. This paper discusses refreshing data warehouses using a metadata model that ensures jobs in batch cycles run only on an as-needed basis. The metadata model ties execution of the stored procedures in each analytical subject area to data changes in the source staging subject area tables, so that only the analytical tables for which new data has arrived from the operational databases are refreshed. The load is skipped if the source data has not changed. Skipping unnecessary loads through this metadata-driven approach saves significant database resources. Resource-savings statistics from an actual production data warehouse show that the proposed techniques substantially reduce the consumption of computing resources.
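The skip-if-unchanged idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `JobMetadata` record, the `needs_refresh` and `run_cycle` functions, the table names, and the use of last-changed timestamps to detect new source data are all assumptions made for illustration. In a real warehouse the metadata would live in database tables and `load` would call a stored procedure.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List


@dataclass
class JobMetadata:
    """Hypothetical metadata row describing one analytical load job."""
    job_name: str
    source_tables: List[str]   # staging tables the job reads from
    last_refreshed: datetime   # when the analytical table was last loaded


def needs_refresh(job: JobMetadata,
                  staging_last_changed: Dict[str, datetime]) -> bool:
    """Return True if any source staging table changed after the last load.

    Assumption: a staging table with no recorded change time is treated
    as unchanged.
    """
    return any(
        staging_last_changed.get(table, datetime.min) > job.last_refreshed
        for table in job.source_tables
    )


def run_cycle(jobs: List[JobMetadata],
              staging_last_changed: Dict[str, datetime],
              load: Callable[[str], None],
              now: datetime) -> List[str]:
    """Run only the jobs whose sources changed; return the names of jobs run."""
    executed = []
    for job in jobs:
        if needs_refresh(job, staging_last_changed):
            load(job.job_name)          # stand-in for the stored procedure call
            job.last_refreshed = now    # record the refresh in the metadata
            executed.append(job.job_name)
        # Otherwise the load is skipped, which is where resources are saved.
    return executed
```

With one job whose staging table changed after its last refresh and one whose staging table did not, `run_cycle` would invoke `load` only for the first job and leave the second job's metadata untouched.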
Cited by
1 article.