1. Hypergraph Partitioning Based Models and Methods for Exploiting Cache Locality in Sparse Matrix-Vector Multiplication
2. M. Amaral, J. Polo, D. Carrera, S. Seelam, and M. Steinder. 2017. Topology-Aware GPU Scheduling for Learning Workloads in Cloud Environments. In SC '17: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis.
3. Asynchronous Iterative Algorithms with Flexible Communication for Nonlinear Network Flow Problems
4. A. Bhatele, G. R. Gupta, L. V. Kalé, and I. Chung. 2010. Automated mapping of regular communication graphs on mesh interconnects. In International Conference on High Performance Computing.
5. A. Bhatele, N. Jain, K. Isaacs, R. Buch, T. Gamblin, S. Langer, and L. Kale. 2014. Optimizing the performance of parallel applications on a 5D torus via task mapping. In International Conference on High Performance Computing.