1. The single server queue in heavy traffic;Kingman,1961
2. An approximate formula for waiting time in single server queues;Marchal;AIIE Trans.,1976
3. PREMA: A predictive multi-task scheduling algorithm for preemptible neural processing units;Choi,2020
4. Dataflow mirroring: Architectural support for highly efficient fine-grained spatial multitasking on systolic-array NPUs;Lee,2021
5. Ten lessons from three generations shaped Google’s TPUv4i: Industrial product;Jouppi,2021