1. K. Datta, M. Murphy, V. Volkov, S. Williams, and J. Carter, “Stencil computations on multicore architectures,” ACM Transactions on Architecture and Code Optimization, vol. 5, no. 3, 2008.
2. P. Gupta, M. T., M. Purushotham, S. L. J., V. N. R., and S. Nanda, “Efficient compiler design for a geometric shape domain-specific language: Emphasizing abstraction and optimization techniques,” EAI Endorsed Transactions on Scalable Information Systems, 2024.
3. L. Sun, C. Tang, Y. Jiang, X. Lian, and J. Guo, “A comprehensive survey on matrix multiplication optimization techniques for GPU,” Journal of Systems Architecture, vol. 117, p. 102097, 2021.
4. W. Shao, J. Zhang, W. Jiang, and X. Song, “Design and optimization of a matrix multiplication module for a ray tracing processor,” Journal of Systems Architecture, vol. 96, pp. 1–12, 2019.
5. P. Gupta, L. Y. Kumar, S. J. V. V. M. S. D., D. C. Kumar, and M. M. V. Chalapathi, “Design of efficient programming language with lexer using $-prefixed identifier,” EAI Endorsed Transactions on Scalable Information Systems, vol. 11, no. 2, 2024.