1. [1] NVIDIA. NVIDIA Tensor Cores. https://www.nvidia.com/en-us/data-center/tensorcore, 2019.
2. [2] Google. Google Cloud TPU. https://cloud.google.com/tpu.
3. [3] Toshio Yoshida. Fujitsu high performance CPU for the Post-K computer. In Hot Chips, volume 30, 2018.
4. [4] Denis Vanderstraeten. A stable and efficient parallel block Gram-Schmidt algorithm. Lecture Notes in Computer Science, 1685:1128–1135, 1999.
5. [5] NVIDIA. kmeans. https://github.com/NVIDIA/kmeans, 2020.