1. Matrix Algebra on GPU and Multicore Architectures. Innovative Computing Laboratory, University of Tennessee,
http://icl.cs.utk.edu/magma/
2. NVIDIA Visual Profiler,
http://developer.nvidia.com/nvidia-visual-profiler
3. Performance Application Programming Interface (PAPI). Innovative Computing Laboratory, University of Tennessee,
http://icl.cs.utk.edu/papi/
4. Datta, K., Williams, S., Volkov, V., Carter, J., Oliker, L., Shalf, J., Yelick, K.: Auto-tuning the 27-Point Stencil for Multicore. In: Proc. iWAPT 2009: The Fourth International Workshop on Automatic Performance Tuning (2009)
5. Glaskowsky, P.N.: NVIDIA’s Fermi: The First Complete GPU Computing Architecture. Technical report (2009)