1. [Brown 20] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D., Wu, J., Winter, C., Hesse, C., Chen, M., Sigler, E., Litwin, M., Gray, S., Chess, B., Clark, J., Berner, C., McCandlish, S., Radford, A., Sutskever, I., and Amodei, D.: Language models are few-shot learners, in Advances in Neural Information Processing Systems, Vol. 33, pp. 1877–1901 (2020)
2. [Chang 22] Chang, S.-Y., Li, B., Sainath, T., Zhang, C., Strohman, T., Liang, Q., and He, Y.: Turn-taking prediction for natural conversational speech, in Proceedings of Interspeech, pp. 1821–1825 (2022)
3. [Clive 22] Clive, J., Cao, K., and Rei, M.: Control prefixes for parameter-efficient text generation, in Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM), pp. 363–382 (2022)
4. [Devlin 19] Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding, in Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1, pp. 4171–4186 (2019)
5. [Ekstedt 20] Ekstedt, E. and Skantze, G.: TurnGPT: a transformer-based language model for predicting turn-taking in spoken dialog, in Findings of the Association for Computational Linguistics: EMNLP, pp. 2981–2990 (2020)