1. Adhianto, L., Banerjee, S., Fagan, M., Krentel, M., Marin, G., Mellor-Crummey, J., Tallent, N.R.: HPCToolkit: tools for performance analysis of optimized parallel programs. Concurr. Comput.: Pract. Exper. 22(6), 685–701 (2010). http://hpctoolkit.org
2. Broquedis, F., Clet-Ortega, J., Moreaud, S., Furmento, N., Goglin, B., Mercier, G., Thibault, S., Namyst, R.: hwloc: a generic framework for managing hardware affinities in HPC applications. In: PDP 2010: The 18th Euromicro International Conference on Parallel, Distributed and Network-Based Computing, Pisa, Italy, February 2010. IEEE (2010)
3. Browne, S., Dongarra, J., Garner, N., Ho, G., Mucci, P.: A portable programming interface for performance evaluation on modern processors. Int. J. High Perform. Comput. Appl. 14(3), 189–204 (2000)
4. Eschweiler, D., Wagner, M., Geimer, M., Knüpfer, A., Nagel, W.E., Wolf, F.: Open trace format 2: the next generation of scalable trace formats and support libraries. In: Proceedings of the International Conference on Parallel Computing (ParCo), Ghent, Belgium, August 30–September 2, 2011, vol. 22 of Advances in Parallel Computing, pp. 481–490. IOS Press (2012)
5. Intel Corporation: Intel architecture instruction set extensions programming reference. https://software.intel.com/isa-extensions