1. Abdel-Shafi, H., Hall, J., Adve, S.V., Adve, V.S.: An evaluation of fine-grain producer-initiated communication in cache-coherent multiprocessors. In: HPCA ’97: Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture, p. 204. IEEE Computer Society, Washington, DC, USA (1997)
2. Amarasinghe, S.P., Gordon, M.I., Karczmarek, M., Lin, J., Maze, D., Rabbah, R.M., Thies, W.: Language and compiler design for streaming applications. Int. J. Parallel Program. 33(2–3), 261–278 (2005)
3. Bronevetsky, G., Gyllenhaal, J., de Supinski, B.R.: CLOMP: accurately characterizing OpenMP application overheads. In: Proceedings of the Fourth International Workshop on OpenMP (IWOMP), pp. 13–25. West Lafayette, IN (May 2008)
4. Cook, H., Asanović, K., Patterson, D.A.: Virtual local stores: enabling software-managed memory hierarchies in mainstream computing environments. Technical Report UCB/EECS-2009-131, EECS Department, University of California, Berkeley (Sep 2009)
5. Falsafi, B., Lebeck, A.R., Reinhardt, S.K., Schoinas, I., Hill, M.D., Larus, J.R., Rogers, A., Wood, D.A.: Application-specific protocols for user-level shared memory. In: Supercomputing ’94: Proceedings of the 1994 Conference on Supercomputing, pp. 380–389. IEEE Computer Society Press, Los Alamitos, CA, USA (1994)