1. Somashekaracharya G. Bhaskaracharya, Julien Demouth, and Vinod Grover. 2020. Automatic Kernel Generation for Volta Tensor Cores. arXiv:2006.12645 [cs.PL]
2. Optimizing matrix multiply using PHiPAC
3. Compiling affine loop nests for distributed-memory parallel architectures
4. Uday Bondhugula, Albert Hartono, J. Ramanujam, and P. Sadayappan. 2008. A Practical Automatic Polyhedral Parallelizer and Locality Optimizer. In Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation (Tucson, AZ, USA) (PLDI’08). ACM, New York, NY, USA, 101–113. https://doi.org/10.1145/1375581.1375595
5. 18.9-Pflops nonlinear earthquake simulation on Sunway TaihuLight