1. A survey of quantization methods for efficient neural network inference;Gholami,2022
2. Microscaling data formats for deep learning;Darvish Rouhani,2023
3. Shared microexponents: A little shifting goes a long way;Darvish Rouhani,2023
4. Training DNNs with hybrid block floating point;Drumond;Adv. Neural Inf. Process. Syst.,2018
5. Pushing the limits of narrow precision inferencing at cloud scale with Microsoft Floating Point;Darvish Rouhani;Adv. Neural Inf. Process. Syst.,2020