1. Performance, Design, and Autotuning of Batched GEMM for GPUs
2. R. Ballester-Ripoll, E. G. Paredes, and R. Pajarola. 2017. Sobol Tensor Trains for Global Sensitivity Analysis. ArXiv e-prints (Dec. 2017). arXiv:1712.00233
3. The future of microprocessors
4. Sharan Chetlur, Cliff Woolley, Philippe Vandermersch, Jonathan Cohen, John Tran, Bryan Catanzaro, and Evan Shelhamer. 2014. cuDNN: Efficient Primitives for Deep Learning. (2014).
5. Machine Learning Based Auto-Tuning for Enhanced OpenCL Performance Portability