1. AMD: Software optimization guide for AMD family 15h processors (2012).
2. Aulwes, R., Daniel, D., Desai, N., Graham, R., Risinger, L., Taylor, M., Woodall, T., Sukalski, M.: Architecture of LA-MPI, a network-fault-tolerant MPI. In: Proceedings of the 18th International Parallel and Distributed Processing Symposium, p. 15 (2004).
3. Blagojević, F., Hargrove, P., Iancu, C., Yelick, K.: Hybrid PGAS runtime support for multicore nodes. In: Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model PGAS ’10, pp. 3:1–3:10. ACM (2010).
4. Broquedis, F., Clet-Ortega, J., Moreaud, S., Furmento, N., Goglin, B., Mercier, G., Thibault, S., Namyst, R.: hwloc: A generic framework for managing hardware affinities in HPC applications. In: Proceedings of the 18th Euromicro Conference on Parallel, Distributed and Network-Based Processing PDP ’10, pp. 180–186. IEEE Computer Society (2010).
5. Feind, K., McMahon, K.: An ultrahigh performance MPI implementation on SGI ccNUMA Altix systems. Comput. Methods Sci. Technol., 67–70 (2006).