Affiliation:
1. Bilkent University, Turkey
2. Louisiana State University, USA
3. State University of New York at Buffalo (SUNY), USA
Abstract
The unbounded increase in the size of data generated by scientific applications necessitates collaboration and sharing among the nation’s education and research institutions. Simply purchasing high-capacity, high-performance storage systems and adding them to the existing infrastructure of the collaborating institutions does not solve the underlying and highly challenging data-handling problem. Scientists are compelled to spend a great deal of time and energy on basic data-handling issues, such as where data is physically located, how to access it, and how to move it to visualization or compute resources for further analysis. This chapter presents the design and implementation of a reliable and efficient distributed data storage system, PetaShare, which spans multiple institutions across the state of Louisiana. At the back end, PetaShare provides a unified name space and efficient data movement across geographically distributed storage sites. At the front end, it provides lightweight clients that enable easy, transparent, and scalable access. In PetaShare, the authors have designed and implemented an asynchronously replicated multi-master metadata system for enhanced reliability and availability. The authors also present a high-level cross-domain metadata schema that provides a structured, systematic view of the multiple science domains supported by PetaShare.