1. Intel 64 and IA-32 architectures software developer’s manual, volume 2: Instruction set reference, A-Z (2015). http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf. Accessed 22 July 2016
2. Adhianto, L., Banerjee, S., Fagan, M., Krentel, M., Marin, G., Mellor-Crummey, J., Tallent, N.R.: HPCToolkit: tools for performance analysis of optimized parallel programs. Concurrency Comput. Pract. Experience 22(6), 685–701 (2010)
3. Anderson, T.E., Lazowska, E.D.: Quartz: a tool for tuning parallel program performance. In: Proceedings of the ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems. Citeseer (1990)
4. Böhme, D., Wolf, F., de Supinski, B.R., Schulz, M., Geimer, M.: Scalable critical-path based performance analysis. In: 2012 IEEE 26th International Parallel & Distributed Processing Symposium (IPDPS), pp. 1330–1340. IEEE (2012)
5. Curtsinger, C., Berger, E.D.: Coz: finding code that counts with causal profiling. In: Proceedings of the 25th Symposium on Operating Systems Principles, pp. 184–197. ACM (2015)