Funder
Agencia Estatal de Investigación
Publisher
Springer Science and Business Media LLC
Subject
Hardware and Architecture,Information Systems,Theoretical Computer Science,Software
Reference37 articles.
1. León B, Franco D, Rexachs D, Luque E (2018) Characterization of I/O Patterns generated by Fault Tolerance in HPC environments. International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA) vol 18, p 28
2. Lemarinier Bouteiller, Capello Krawezik (2003) Coordinated checkpoint versus message log for fault tolerant MPI, in 2003 Proceedings IEEE International Conference on Cluster Computing, pp. 242–250. https://doi.org/10.1109/CLUSTR.2003.1253321
3. Shahzad F, Thies J, Kreutzer M, Zeiser T, Hager G, Wellein G (2019) CRAFT: a library for easier application-level checkpoint/restart and automatic fault tolerance. IEEE Trans Parallel Distrib Syst 30(3):501. https://doi.org/10.1109/TPDS.2018.2866794
4. Coti C, Herault T, Lemarinier P, Pilard L, Rezmerita A, Rodriguez E, Cappello F (2006) Blocking vs. Non-Blocking Coordinated Checkpointing for Large-Scale Fault Tolerant MPI, In: SC ’06: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, pp. 18–18. https://doi.org/10.1109/SC.2006.15
5. Moríñigo JA, Rodríguez-Pascual M, Mayo-García R (2019) On the modelling of optimal coordinated checkpoint period in supercomputers. J Supercomput 75(2):930
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献