Storage Systems for Data Warehousing

Published: 2009
Pages: 1859-1864
Container-title: Encyclopedia of Data Warehousing and Mining, Second Edition

Author:

Thomasian Alexander¹

Affiliation:

1. New Jersey Institute of Technology - NJIT, USA

Abstract

Data storage requirements have consistently increased over time. According to the latest WinterCorp survey (http://www.WinterCorp.com), “The size of the world’s largest databases has tripled every two years since 2001.” With database sizes in excess of 1 terabyte, there is a clear need for storage systems that are both cost effective and highly reliable. Historically, large databases were implemented on mainframe systems, which are large and expensive to purchase and maintain. In recent years, large data warehouse applications have been deployed on Linux and Windows hosts as replacements for the existing mainframe systems. These hosts are significantly less expensive to purchase and require fewer resources to run and maintain. With large databases it is less feasible, and less cost effective, to use tapes for backup and restore. The time required to copy terabytes of data from a database to a serial medium (streaming tape) is measured in hours, which would significantly degrade performance and decrease availability. Alternatives to serial backup include local replication, mirroring, or geoplexing of data. The increasing demands of larger databases must be met by less expensive disk storage systems that are nonetheless highly reliable and less susceptible to data loss. This article is organized into five sections. The first section provides background information that introduces the concepts of disk arrays. The following three sections detail the concepts used to build complex storage systems: (i) Redundant Arrays of Independent Disks (RAID); (ii) multilevel RAID (MRAID); (iii) concurrency control and storage transactions. The conclusion contains a brief survey of modular storage prototypes.

Publisher

IGI Global

References: 14 articles.

1. Amiri, K., Gibson, G. A., & Golding, R. (2000). Highly concurrent shared storage. Proceedings 20th International Conference Distributed Computing Systems (pp. 298-307). Taiwan.

2. Baek, S. H., Kim, B. W., Jeung, E., & Park, C. W. (2001). Reliability and performance of hierarchical RAID with multiple controllers. Proceedings 20th Annual ACM Symposium on Principles of Distributed Computing (pp. 246-254). USA.

3. RAID: high-performance, reliable secondary storage

4. Fleiner, C. (2006). Reliability of modular mesh-connected intelligent storage brick systems. IBM J. R&D.

5. Kenchammana-Hosekote, D. R., Golding, R. A., Fleiner, C., & Zaki, O. A. (2004). The design and evaluation of network RAID protocols. IBM Research Report RJ 10316, Almaden, CA.

Cited by 1 article.

1. Leasing as a Risk-Sharing Mechanism. SSRN Electronic Journal, 2019.