1. TOP500 Supercomputing Sites. http://www.top500.org/
2. Ma W, Krishnamoorthy S, Villa O, Kowalski K (2010) Acceleration of streamed tensor contraction expressions on GPGPU-based clusters. In: Proceedings of the 2010 IEEE international conference on cluster computing (Cluster’10)
3. Jacobsen DA, Thibault JC, Senocak I (2010) An MPI-CUDA implementation for massively parallel incompressible flow computations on multi-GPU clusters. In: Proceedings of the 48th AIAA aerospace sciences meeting
4. Phillips EH, Fatica M (2010) Implementing the Himeno benchmark with CUDA on GPU clusters. In: Proceedings of the 24th IEEE international parallel and distributed processing symposium (IPDPS’10)
5. Choi JW, Singh A, Vuduc RW (2010) Model-driven autotuning of sparse matrix-vector multiply on GPUs. In: Proceedings of the 15th ACM SIGPLAN symposium on principles and practice of parallel programming (PPoPP’10), pp 115–126