1. Chengliang Zhang, Minchen Yu, Wei Wang, and Feng Yan. MArk: Exploiting cloud services for cost-effective, SLO-aware machine learning inference serving. In 2019 USENIX Annual Technical Conference (USENIX ATC 19), pages 1049--1062, 2019.
2. The Architectural Implications of Facebook's DNN-Based Personalized Recommendation
3. Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective
4. Daniel Crankshaw, Xin Wang, Giulio Zhou, Michael J. Franklin, Joseph E. Gonzalez, and Ion Stoica. Clipper: A low-latency online prediction serving system. In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17), pages 613--627, 2017.
5. InferLine