1. A cache oblivious algorithm for matrix multiplication based on Peano's space filling curve;Bader,2006
2. Improving matrix-based dynamic programming on massively parallel accelerators;Bednárek;Inf. Syst.,2017
3. Parallel Programming in OpenMP;Chandra,2001
4. Yinyang k-means: a drop-in replacement of the classic k-means with consistent speedup;Ding,2015
5. CUDA Application Design and Development;Farber,2011