1. 2017. TorchVision. https://pytorch.org/vision.
2. 2018. NVIDIA Triton. https://developer.nvidia.com/nvidia-triton-inference-server.
3. 2020. TorchServe. https://pytorch.org/serve.
4. 2021. NVIDIA CUDA Toolkit Documentation (v11.3.0). https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/.
5. Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI).