Proactive Fault Tolerance in MPI Applications Via Task Migration

Authors:

Chakravorty Sayantan, Mendes Celso L., Kalé Laxmikant V.

Publisher

Springer Berlin Heidelberg

References: 25 articles.

1. Gropp, W., Lusk, E., Skjellum, A.: Using MPI, 2nd edn. MIT Press, Cambridge (1999)

2. Gropp, W., Lusk, E.: Fault tolerance in message passing interface programs. International Journal of High Performance Computing Applications 18(3), 363–372 (2004)

3. Huang, C.: System support for checkpoint and restart of Charm++ and AMPI applications. Master’s thesis, Dep. of Computer Science, University of Illinois, Urbana, IL (2004), Available at: http://charm.cs.uiuc.edu/papers/CheckpointThesis.html

4. Zheng, G., Shi, L., Kalé, L.V.: FTC-Charm++: An in-memory checkpoint-based fault tolerant runtime for Charm++ and MPI. In: 2004 IEEE International Conference on Cluster Computing, San Diego, CA (2004)

5. Chakravorty, S., Kalé, L.V.: A fault tolerant protocol for massively parallel machines. In: FTPDS Workshop at IPDPS 2004, Santa Fe, NM. IEEE Press, Los Alamitos (2004)

Cited by 14 articles.

1. A New Fault-Tolerant Algorithm Based on Replication and Preemptive Migration in Cloud Computing;International Journal of Cloud Applications and Computing;2022-07-22

2. Classification of Resilience Techniques Against Functional Errors at Higher Abstraction Layers of Digital Systems;ACM Computing Surveys;2018-07-31

3. What does fault tolerant deep learning need from MPI?;Proceedings of the 24th European MPI Users' Group Meeting (EuroMPI '17);2017

4. Optimizing the fault-tolerance overheads of HPC systems using prediction and multiple proactive actions;The Journal of Supercomputing;2015-06-04

5. Fault-Tolerant MPI;Computer Communications and Networks;2015
