1. A.M. Aji, W.C. Feng, Accelerating data-serial applications on GPGPUs: a systems approach, Tech. Rep. TR-08-24, Blacksburg, VA, 2008.
2. AMD Accelerated Parallel Processing SDK, June 2011. http://developer.amd.com/sdks/AMDAPPSDK/.
3. D. Bailey, et al., The NAS parallel benchmarks, Tech. Rep. RNR-94-007, Moffett Field, CA, 1994.
4. CUDA Toolkit 4.0, May 2011. http://developer.nvidia.com/cuda-toolkit-40.
5. P. Du, R. Weber, P. Luszczek, S. Tomov, G. Peterson, J. Dongarra, From CUDA to OpenCL: towards a performance-portable solution for multi-platform GPU programming, Tech. Rep. UT-CS-10-656, Knoxville, TN, 2010.