1. [n.d.]. GPU Pro Tip: CUDA 7 Streams Simplify Concurrency. https://developer.nvidia.com/blog/gpu-pro-tip-cuda-7-streams-simplify-concurrency/. Accessed: 2022-10-21.
2. Sebastian Baunsgaard, Sebastian Benjamin Wrede, and Pinar Tözün. 2020. Training for Speech Recognition on Coprocessors. In ADMS.
3. CuMAS
4. Patryk Chrabaszcz, Ilya Loshchilov, and Frank Hutter. 2017. A Down-sampled Variant of ImageNet as an Alternative to the CIFAR datasets. arXiv preprint (2017).
5. Criteo. [n.d.]. Criteo 1TB Click Logs dataset. https://www.criteo.com/news/press-releases/2015/07/criteo-releases-industrys-largest-ever-dataset/.