1. [n. d.]. cuBLAS: Basic Linear Algebra on NVIDIA GPUs. https://developer.nvidia.com/cublas.
2. [n. d.]. Intel PMU profiling tools. https://github.com/andikleen/pmu-tools.
3. [n. d.]. Intel(R) Math Kernel Library for Deep Neural Networks (Intel(R) MKL-DNN). https://github.com/rsdubtso/mkl-dnn.
4. [n. d.]. The Performance Application Programming Interface (PAPI). https://tvm.apache.org/docs/how_to/profile/papi.html.
5. Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). 265–283.