1. A BF16 FMA is All You Need for DNN Training
2. Dhiraj D. Kalamkar et al. A study of BFLOAT16 for deep learning training. ArXiv, abs/1905.12322, 2019.
3. Neil Burgess et al. Bfloat16 processing for neural networks. 2019 IEEE 26th Symposium on Computer Arithmetic (ARITH), pages 88--91, 2019.
4. Stefan Mach et al. FPnew: An open-source multiformat floating-point unit architecture for energy-proportional transprecision computing. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2020.
5. Luca Bertaccini et al. MiniFloat-NN and ExSdotp: An ISA extension and a modular open hardware unit for low-precision training on RISC-V cores. 2022 IEEE 29th Symposium on Computer Arithmetic (ARITH), pages 1--8, 2022.