1. Chengliang Zhang, Minchen Yu, Wei Wang, and Feng Yan. MArk: Exploiting cloud services for cost-effective, SLO-aware machine learning inference serving. In 2019 USENIX Annual Technical Conference (USENIX ATC 19), pages 1049--1062, 2019.
2. The Architectural Implications of Facebook's DNN-Based Personalized Recommendation
3. Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective
4. Daniel Crankshaw, Xin Wang, Giulio Zhou, Michael J. Franklin, Joseph E. Gonzalez, and Ion Stoica. Clipper: A low-latency online prediction serving system. In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17), pages 613--627, 2017.
5. InferLine