1. KAUST BLAS.
http://ecrc.kaust.edu.sa/Pages/Res-kblas.aspx
2. Abdelfattah, A., Keyes, D., Ltaief, H.: KBLAS: an optimized library for dense matrix-vector multiplication on GPU accelerators. ACM Trans. Math. Softw. (accepted subject to revision) (2014).
http://arxiv.org/abs/1410.1726
3. Anzt, H., Tomov, S., Dongarra, J.: Implementing a Sparse Matrix Vector Product for the SELL-C/SELL-C-σ formats on NVIDIA GPUs. Technical report (2014).
http://www.icl.utk.edu/sites/icl/files/publications/2014/icl-utk-772-2014.pdf
4. Ashari, A., Sedaghati, N., Eisenlohr, J., Parthasarathy, S., Sadayappan, P.: Fast sparse matrix-vector multiplication on GPUs for graph applications. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2014, pp. 781–792 (2014).
http://dx.doi.org/10.1109/SC.2014.69
5. Balay, S., Abhyankar, S., Adams, M.F., Brown, J., Brune, P., Buschelman, K., Eijkhout, V., Gropp, W.D., Kaushik, D., Knepley, M.G., McInnes, L.C., Rupp, K., Smith, B.F., Zhang, H.: PETSc Web page (2014).
http://www.mcs.anl.gov/petsc