1. Rohan Anil, Gabriel Pereyra, Alexandre Passos, Robert Ormandi, George E Dahl, and Geoffrey E Hinton. 2018. Large scale distributed neural network training through online distillation. arXiv preprint arXiv:1804.03235 (2018).
2. ItemSage: Learning Product Embeddings for Shopping Recommendations at Pinterest
3. Online Knowledge Distillation with Diverse Peers
4. Jin Chen, Defu Lian, Yucheng Li, Baoyun Wang, Kai Zheng, and Enhong Chen. 2022. Cache-Augmented Inbatch Importance Resampling for Training Recommender Retriever. arXiv preprint arXiv:2205.14859 (2022).
5. Revisiting Pre-Trained Models for Chinese Natural Language Processing