1. Barton C, Caşcaval C, Almási G, Zheng Y, Farreras M, Chatterjee S, Amaral JN (2006) Shared memory programming for large scale machines. In: Proc ACM SIGPLAN conf on programming language design and implementation (PLDI’06), Ottawa, Canada, pp 108–117
2. Bell C, Nishtala R (2004) UPC implementation of the sparse triangular solve and NAS FT. Last visit: April 2012. http://www.cs.berkeley.edu/~rajeshn/pubs/bell_nishtala_spts_ft.pdf
3. Bell C, Bonachea D, Nishtala R, Yelick K (2006) Optimizing bandwidth limited problems using one-sided communication and overlap. In: Proc 20th intl parallel and distributed processing symp (IPDPS’06), Rhodes Island, Greece
4. Buluç A, Gilbert JR (2008) Challenges and advances in parallel sparse matrix-matrix multiplication. In: Proc 37th intl conf on parallel processing (ICPP’08), Portland, OR, USA, pp 503–510
5. Dongarra J (2000) Templates for the solution of algebraic eigenvalue problems: a practical guide. SIAM, Philadelphia, Chap 10