1. Qin, L., Che, W., Li, Y., et al.: A stack-propagation framework with token-level intent detection for spoken language understanding. In: Proceedings of EMNLP, Hong Kong, China, pp. 2078–2087 (2019)
2. Qin, L., Liu, T., Che, W., et al.: A co-interactive transformer for joint slot filling and intent detection. In: Proceedings of ICASSP, Toronto, Canada, pp. 8193–8197 (2021)
3. M'hamdi, M., Kim, D.S., et al.: X-METRA-ADA: cross-lingual meta-transfer learning adaptation to natural language understanding and question answering. In: Proceedings of NAACL, Online, pp. 3617–3632 (2021)
4. Tang, R., Lu, Y., Liu, L., et al.: Distilling task-specific knowledge from BERT into simple neural networks. arXiv preprint arXiv:1903.12136 (2019)
5. Zhou, W., Xu, C., McAuley, J.: BERT learns to teach: knowledge distillation with meta learning. In: Proceedings of ACL, Dublin, Ireland, pp. 7037–7049 (2022)