1. [Akçay 20] M. B. Akçay and K. Oğuz: Speech emotion recognition: Emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers, Speech Communication, Vol. 116, pp. 56–76 (2020)
2. [Boholm 11] M. Boholm and G. Lindblad: Head movements and prosody in multimodal feedback, in Proc. NEALT Proceedings Series: 3rd Nordic Symposium on Multimodal Communication, Vol. 15, pp. 25–32 (2011)
3. [Bothe 18] C. Bothe, C. Weber, S. Magg, and S. Wermter: A context-based approach for dialogue act recognition using simple recurrent neural networks, in Proc. 11th Int. Conf. Language Resources and Evaluation (LREC 2018), pp. 1952–1957, Miyazaki, Japan (2018)
4. [Bunt 12] H. Bunt, J. Alexandersson, J.-W. Choe, A. C. Fang, K. Hasida, V. Petukhova, A. Popescu-Belis, and D. R. Traum: ISO 24617-2: A semantically-based standard for dialogue annotation, in Proc. 8th Int. Conf. Language Resources and Evaluation (LREC 2012), pp. 430–437, Istanbul, Turkey (2012)
5. [Chen 16] T. Chen and C. Guestrin: XGBoost: A scalable tree boosting system, in Proc. 22nd ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining (KDD '16), pp. 785–794, New York, NY, USA (2016)