1. Tinybert: Distilling BERT for natural language understanding;jiao;CoRR,2019
2. Distilling task-specific knowledge from bert into simple neural networks;tang,2019
3. Distilbert a distilled version of bert: smaller faster cheaper and lighter;sanh;5th Workshop on Energy Efficient Machine Learning and Cognitive Computing,2019
4. Knowledge distillation from internal representations;aguilar,2019
5. Improving multi-task deep neural networks via knowledge distillation for natural language understanding;liu,2019