1. Jiang, B., et al.: DCAF: a dynamic computation allocation framework for online serving system. arXiv preprint arXiv:2006.09684 (2020)
2. Covington, P., Adams, J., Sargin, E.: Deep neural networks for YouTube recommendations. In: Proceedings of the 10th ACM Conference on Recommender Systems, pp. 191–198 (2016)
3. Crankshaw, D., Wang, X., Zhou, G., Franklin, M.J., Gonzalez, J.E., Stoica, I.: Clipper: a low-latency online prediction serving system. In: NSDI, vol. 17, pp. 613–627 (2017)
4. Dean, J., et al.: Large scale distributed deep networks. In: Advances in Neural Information Processing Systems, vol. 25 (2012)
5. Grbovic, M., Cheng, H.: Real-time personalization using embeddings for search ranking at Airbnb. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 311–320 (2018)