1. Blankinship, Algorithm 287: matrix triangulation with integer arithmetic [F1], Commun. ACM, 1966.
2. Bradley, Algorithms for Hermite and Smith normal matrices and linear diophantine equations, Math. Comput., 1971.
3. R.P. Brent, Parallel algorithms in linear algebra, in: Proc. Second NEC Research Symposium on Algorithms and Architectures, 1993, pp. 54–72.
4. S. Chatterjee, S. Sen, Cache-efficient matrix transposition, in: Proc. Sixth International Symposium on High-Performance Computer Architecture, IEEE Computer Society, 2000, pp. 195–205.
5. Choi, Parallel matrix transpose algorithms on distributed memory concurrent computers, Parallel Comput., 1995.