The Effect of System Utilization on Application Performance Variability

Authors:

Li Boyang (1), Chunduri Sudheer (2), Harms Kevin (2), Fan Yuping (1), Lan Zhiling (1)

Affiliations:

1. Illinois Institute of Technology, Chicago, IL, USA

2. Argonne National Laboratory, Lemont, IL, USA

Funders

US National Science Foundation

U.S. Department of Energy

Publisher

ACM Press


Cited by 12 articles.

1. Interpretable Modeling of Deep Reinforcement Learning Driven Scheduling. 2023 31st International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS). 2023-10-16.

2. Machine Learning for Interconnect Network Traffic Forecasting: Investigation and Exploitation. ACM SIGSIM Conference on Principles of Advanced Discrete Simulation. 2023-06-21.

3. Exploring Machine Learning Models with Spatial-Temporal Information for Interconnect Network Traffic Forecasting. ACM SIGSIM Conference on Principles of Advanced Discrete Simulation. 2023-06-21.

4. DRAS: Deep Reinforcement Learning for Cluster Scheduling in High Performance Computing. IEEE Transactions on Parallel and Distributed Systems. 2022-12-01.

5. Study of Workload Interference with Intelligent Routing on Dragonfly. SC22: International Conference for High Performance Computing, Networking, Storage and Analysis. 2022-11.
