Author:
Le Fèvre Valentin,Bosilca George,Bouteiller Aurelien,Herault Thomas,Hori Atsushi,Robert Yves,Dongarra Jack
Publisher
Springer International Publishing
References (26 articles)
1. Amdahl, G.: The validity of the single processor approach to achieving large scale computing capabilities. In: AFIPS Conference Proceedings, vol. 30, pp. 483–485. AFIPS Press (1967)
2. Ashraf, R.A., Hukerikar, S., Engelmann, C.: Shrink or substitute: handling process failures in HPC systems using in-situ recovery. CoRR abs/1801.04523 (2018). http://arxiv.org/abs/1801.04523
3. Bland, W., Bouteiller, A., Herault, T., Bosilca, G., Dongarra, J.: Post-failure recovery of MPI communication capability: design and rationale. Int. J. High Perform. Comput. Appl. 27(3), 244–254 (2013). https://doi.org/10.1177/1094342013488238, http://hpc.sagepub.com/content/27/3/244.abstract
4. Cappello, F., Geist, A., Gropp, W., Kale, S., Kramer, B., Snir, M.: Toward exascale resilience: 2014 update. Supercomput. Front. Innov. 1(1), 5–28 (2014)
5. Cavelan, A., Li, J., Robert, Y., Sun, H.: When Amdahl meets Young/Daly. In: Cluster 2016. IEEE Computer Society Press (2016)