1. The Code Repo for PipeDream: Pipeline Parallelism for DNN Training,0
2. Efficient Transformers: A Survey;tay,2022
3. Stanford Question Answering Dataset v1.1,0
4. Delta: Dynamically optimizing GPU memory beyond tensor recomputation;tang,2022
5. Wikipedia Dataset,0