1. AMD: APP Profiler. Online.
http://developer.amd.com/tools/AMDAPPProfiler
(2011)
2. Clearspeed: Visual Profiler. Online.
http://www.clearspeed.com/products/sdk_details.php
(2011)
3. Dietrich, R., Ilsche, T., Juckeland, G.: Non-intrusive performance analysis of parallel hardware accelerated applications on hybrid architectures. In: International Conference on Parallel Processing Workshops, San Diego, pp. 135–143 (2010). doi: 10.1109/icppw.2010.30
4. Farooqui, N., Kerr, A., Diamos, G., Yalamanchili, S., Schwan, K.: A framework for dynamically instrumenting GPU compute applications within GPU Ocelot. In: Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-4, pp. 9:1–9:9. ACM, New York, NY (2011). doi: 10.1145/1964179.1964192
5. Fuerlinger, K., Wright, N.J., Skinner, D.: Comprehensive performance monitoring for GPU cluster systems. In: Proceedings of the 12th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC), in conjunction with IPDPS-11, Anchorage, AK (2011).
http://projekt17.pub.lab.nm.ifi.lmu.de/fuerling/research/pubs//FUERLINGER_2011_PDSEC.pdf