Abstract
High-performance and high-throughput computing (HPC/HTC) is challenged by ever-increasing demands on software stacks and increasingly diverging requirements from different research communities. This led to a reassessment of the operational concept of the HPC/HTC clusters at the Physikalisches Institut at the University of Bonn. As a result, the present HPC/HTC cluster (named BAF2) introduced various conceptual changes compared to conventional clusters. All jobs are now run in containers, and a container-aware resource management system is used, which allowed us to switch to a model without login/head nodes. Furthermore, a modern, feature-rich storage system with powerful interfaces has been deployed. We describe the design considerations, the implemented functionality, and the operational experience gained with this new-generation setup, which has turned out to be very successful and well accepted by its users.
Funder
Deutsche Forschungsgemeinschaft
Projekt DEAL
Publisher
Springer Science and Business Media LLC
Subject
Nuclear and High Energy Physics, Computer Science (miscellaneous), Software
Cited by: 5 articles.