1. Ramesh C. Agarwal. 1988. A parallel implementation of matrix multiplication and LU factorization on the IBM 3090. Proceedings of the IFIP WG.
2. LU factorization for accelerator-based systems
3. AMD. 2022. AMD cdna2 white-paper. https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf
4. Optimized HPL for AMD GPU and multi-core CPU usage
5. Sebastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, and Yi Zhang. 2023. Sparks of Artificial General Intelligence: Early experiments with GPT-4. arXiv:2303.12712v2