Affiliation:
1. Department of Informatics and Statistics (INE), Distributed Systems Research Lab (LaPeSD), Federal University of Santa Catarina (UFSC), Florianópolis, Brazil
Abstract
The advent of cloud computing has made computing infrastructure accessible to millions of users who face resource constraints. In the context of high performance computing (HPC), public cloud resources have emerged as a cost-effective alternative to expensive on-premises clusters. However, adopting this approach involves several challenges and limitations. This paper proposes HPC@Cloud, a provider-agnostic open-source software toolkit that facilitates the migration, testing, and execution of HPC applications in public clouds. The toolkit takes advantage of various fault tolerance technologies to enable the use of inexpensive transient cloud infrastructure, commonly known as "spot" instances. It also integrates with Singularity containers, allowing users to run complex applications on virtual HPC clusters in a portable and reproducible way. Finally, it provides a data-based empirical approach to estimating cloud infrastructure costs for HPC workloads. The results obtained on two public cloud providers (AWS and Vultr) show that: (i) HPC@Cloud can efficiently build virtual HPC clusters on the cloud; (ii) the new adaptive fault tolerance strategy outperforms existing strategies based on blocking restoration; (iii) the integration of Singularity containers into HPC@Cloud improves the portability of HPC applications to public clouds with negligible performance penalty to the applications; (iv) the proposed cost prediction approach can estimate the cost of running the applications on AWS and Vultr with up to 93% accuracy on average.
Funder
Amazon Web Services
Conselho Nacional de Desenvolvimento Científico e Tecnológico
Subject
Computational Theory and Mathematics, Computer Networks and Communications, Computer Science Applications, Theoretical Computer Science, Software