1. Aggarwal A, Vitter J (1988) The input/output complexity of sorting and related problems. Commun ACM 31:1116–1127
2. Bader DA, Moret BME, Sanders P (2002) Algorithm engineering for parallel computation. In: Fleischer R, Meineche-Schmidt E, Moret BME (eds) Experimental algorithmics. Lecture notes in computer science, vol 2547. Springer, Berlin, pp 1–23
3. Blelloch GE, Leiserson CE, Maggs BM, Plaxton CG, Smith SJ, Zagha M (1998) An experimental analysis of parallel sorting algorithms. Theory Comput Syst 31(2):135–167
4. Choi J, Dongarra JJ, Pozo R, Walker DW (1992) ScaLAPACK: a scalable linear algebra library for distributed memory concurrent computers. In: The 4th symposium on the frontiers of massively parallel computations. McLean, pp 120–127
5. Culler DE, Karp RM, Patterson DA, Sahay A, Schauser KE, Santos E, Subramonian R, von Eicken T (1993) LogP: towards a realistic model of parallel computation. In: 4th symposium on principles and practice of parallel programming. ACM SIGPLAN, pp 1–12