1. Thakur, R., Rabenseifner, R., Gropp, W.: Optimization of Collective Communication Operations in MPICH. International Journal of High Performance Computing Applications 19(1), 49–66 (2005)
2. Faraj, A., Yuan, X., Lowenthal, D.: STAR-MPI: Self Tuned Adaptive Routines for MPI Collective Operations. In: Proceedings of the 20th Annual International Conference on Supercomputing, ICS, pp. 199–208. ACM, New York (2006)
3. Kielmann, T., Hofman, R.F.H., Bal, H.E., Plaat, A., Bhoedjang, R.A.F.: MagPIe: MPI’s Collective Communication Operations for Clustered Wide Area Systems. In: Proceedings of the Seventh ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP, pp. 131–140. ACM, New York (1999)
4. Miller, S., Kendall, R.: Implementing Optimized MPI Collective Communication Routines on the IBM BlueGene/L Supercomputer. Technical report, Iowa State University (2005)
5. Gabriel, E., Huang, S.: Runtime Optimization of Application Level Communication Patterns. In: International Parallel & Distributed Processing Symposium, IPDPS, pp. 1–8. IEEE (2007)